API reference

0.5.0
package

depthai_nodes

module

depthai_nodes.logging

function
setup_logging(level: Optional [ str ] = None, file: Optional [ str ] = None)
Globally configures logging for the depthai_nodes package.  @type level: str or None @param level: Logging level. One of "CRITICAL", "DEBUG", "ERROR", "INFO", and "WARN".     Can be overridden with the "DEPTHAI_NODES_LEVEL" env variable. If not set, defaults to     "DEPTHAI_LEVEL" if that is set, otherwise to "WARN". @type file: str or None @param file: Path to a file where logs will be saved. If None, logs are not     saved. Defaults to None.
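A minimal usage sketch (the file name is illustrative):

    from depthai_nodes.logging import setup_logging

    # Log at DEBUG level and mirror all log records to a file.
    setup_logging(level="DEBUG", file="depthai_nodes.log")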
class

depthai_nodes.logging.LogLevel(enum.Enum)

constant
CRITICAL
constant
DEBUG
constant
ERROR
constant
INFO
constant
WARN
package

depthai_nodes.message

class
Classifications
Classification class for storing the classes and their respective scores.  Attributes ---------- classes : list[str]     A list of classes. scores : NDArray[np.float32]     Corresponding probability scores. transformation : dai.ImgTransformation     Image transformation object.
class
Cluster
Cluster class for storing a cluster.  Attributes ---------- label : int     Label of the cluster. points : List[dai.Point2f]     List of points in the cluster.
class
Clusters
Clusters class for storing clusters.  Attributes ---------- clusters : List[Cluster]     List of clusters. transformation : dai.ImgTransformation     Image transformation object.
class
Collection
A generic DepthAI message containing a list of items of a single type T.  Notes: - Python generics are erased at runtime, so the runtime item type is inferred   from the first item once the collection becomes non-empty.
class
GatheredData
Contains N messages and reference data that the messages were matched with.  Attributes ---------- reference_data: TReference     Data that is used to determine how many of TGathered to gather. items: List[TGathered]     List of gathered data.
class
Keypoints
DepthAI Nodes keypoints message wrapping a native ``dai.KeypointsList``.
class
Line
Line class for storing a line.  Attributes ---------- start_point : dai.Point2f     Start point of the line with x and y coordinate. end_point : dai.Point2f     End point of the line with x and y coordinate. confidence : float     Confidence of the line.
class
Lines
Lines class for storing lines.  Attributes ---------- lines : List[Line]     List of detected lines. transformation : dai.ImgTransformation     Image transformation object.
class
Map2D
Map2D class for storing a 2D map of floats.  Attributes ---------- map : NDArray[np.float32]     2D map. width : int     2D Map width. height : int     2D Map height. transformation : dai.ImgTransformation     Image transformation object.
class
Prediction
Prediction class for storing a prediction.  Attributes ---------- prediction : float     The predicted value.
class
Predictions
Predictions class for storing predictions.  Attributes ---------- predictions : List[Prediction]     List of predictions. transformation : dai.ImgTransformation     Image transformation object.
class
SegmentationMask
SegmentationMask class for a single- or multi-object segmentation mask. Unassigned pixels are represented with "-1" and foreground classes with non-negative integers.  Attributes ---------- mask: NDArray[np.int16]     Segmentation mask. transformation : dai.ImgTransformation     Image transformation object.
class
SnapData
DepthAI-compatible message for representing a single snap event.  Attributes ---------- snap_name : str     Logical name of the snap. file_group : dai.FileGroup     Object containing the snap image and associated data. tags : List[str]     Optional list of tags to include. extras : Dict[str, str]     Additional metadata.
module

depthai_nodes.message.collection

type variable
T
package

depthai_nodes.message.creators

function
create_classification_message(classes: List [ str ], scores: Union [ np.ndarray , List ]) -> Classifications
Create a message for classification. The message contains the class names and their respective scores, sorted in descending order of scores.  @param classes: A list containing class names. @type classes: List[str] @param scores: A numpy array of shape (n_classes,) containing the probability score of each class. @type scores: np.ndarray  @return: A message with attributes `classes` and `scores`. `classes` is a list of classes, sorted in descending order of scores. `scores` is a list of the corresponding scores. @rtype: Classifications  @raises ValueError: If the provided classes are None. @raises ValueError: If the provided classes are not a list. @raises ValueError: If the provided classes are empty. @raises ValueError: If the provided scores are None. @raises ValueError: If the provided scores are not a list or a numpy array. @raises ValueError: If the provided scores are empty. @raises ValueError: If the provided scores are not a 1D array. @raises ValueError: If the provided scores are not of type float. @raises ValueError: If the provided scores do not sum to 1. @raises ValueError: If the numbers of classes and scores mismatch.
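An illustrative sketch (array values invented; classes come back sorted by score as described above):

    import numpy as np
    from depthai_nodes.message.creators import create_classification_message

    classes = ["cat", "dog", "bird"]
    scores = np.array([0.2, 0.7, 0.1], dtype=np.float32)  # 1D floats summing to 1
    msg = create_classification_message(classes, scores)
    print(msg.top_class, msg.top_score)  # "dog" ~0.7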
function
create_classification_sequence_message(classes: List [ str ], scores: Union [ np.ndarray , List ], ignored_indexes: Optional [ List [ int ] ] = None, remove_duplicates: bool = False, concatenate_classes: bool = False) -> Classifications
Creates a message for a multi-class sequence. The message contains the class names and their respective scores, ordered according to the sequence. The 'scores' array is a sequence of probabilities for each class at each position in the sequence.  @param classes: A list of class names, with length 'n_classes'. @type classes: List @param scores: A numpy array of shape (sequence_length, n_classes) containing the (row-wise) probability distributions over the classes. @type scores: np.ndarray @param ignored_indexes: A list of indexes to ignore during classification generation (e.g., background class, padding class). Defaults to None. @type ignored_indexes: Optional[List[int]] @param remove_duplicates: If True, removes consecutive duplicates from the sequence. Defaults to False. @type remove_duplicates: bool @param concatenate_classes: If True, concatenates consecutive classes based on the space character. Defaults to False. @type concatenate_classes: bool @return: A Classification message with attributes `classes` and `scores`, where `classes` is a list of class names and `scores` is a list of corresponding scores. @rtype: Classifications @raises ValueError: If 'classes' is not a list of strings. @raises ValueError: If 'scores' is not a 2D array or list of shape (sequence_length, n_classes). @raises ValueError: If the number of classes does not match the number of columns in 'scores'. @raises ValueError: If any score is not in the range [0, 1]. @raises ValueError: If the probabilities in any row of 'scores' do not sum to 1. @raises ValueError: If 'ignored_indexes' is neither None nor a list of valid indexes within the range [0, n_classes - 1].
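A sketch for a toy character sequence (values invented; the ignored index plays the role of a blank/background class):

    import numpy as np
    from depthai_nodes.message.creators import create_classification_sequence_message

    classes = ["a", "b", "-"]  # "-" acts as a blank class here
    scores = np.array([
        [0.8, 0.1, 0.1],  # step 1 -> "a"
        [0.8, 0.1, 0.1],  # step 2 -> "a" (consecutive duplicate, removed below)
        [0.1, 0.8, 0.1],  # step 3 -> "b"
    ], dtype=np.float32)  # each row sums to 1
    msg = create_classification_sequence_message(
        classes, scores, ignored_indexes=[2], remove_duplicates=True
    )
    print(msg.classes)  # ["a", "b"]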
function
create_cluster_message(clusters: List [ List [ List [ Union [ float , int ] ] ] ]) -> Clusters
Create a DepthAI message for clusters.  @param clusters: List of clusters. Each cluster is a list of points with x and y     coordinates. @type clusters: List[List[List[Union[float, int]]]] @return: Clusters message containing the detected clusters. @rtype: Clusters @raise TypeError: If the clusters are not a list. @raise TypeError: If each cluster is not a list. @raise TypeError: If each point is not a list. @raise TypeError: If each value in the point is not an int or float.
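A sketch (coordinates invented):

    from depthai_nodes.message.creators import create_cluster_message

    clusters = [
        [[0.10, 0.20], [0.15, 0.25]],                # cluster 0: two points
        [[0.80, 0.80], [0.85, 0.90], [0.90, 0.95]],  # cluster 1: three points
    ]
    msg = create_cluster_message(clusters)
    print(len(msg.clusters))  # 2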
function
create_detection_message(bboxes: np.ndarray, scores: np.ndarray, angles: np.ndarray = None, labels: np.ndarray = None, label_names: Optional [ List [ str ] ] = None, keypoints: np.ndarray = None, keypoints_scores: np.ndarray = None, keypoint_label_names: Optional [ List [ str ] ] = None, keypoint_edges: Optional [ List [ Tuple [ int , int ] ] ] = None, masks: np.ndarray = None) -> dai.ImgDetections
Create a DepthAI message for object detection. The message contains the bounding boxes in (x_center, y_center, width, height) format, with optional angles, labels, keypoints, and masks.  @param bboxes: Bounding boxes of detected objects in (x_center, y_center, width,     height) format. @type bboxes: np.ndarray @param scores: Confidence scores of the detected objects of shape (N,). @type scores: np.ndarray @param angles: Angles of detected objects expressed in degrees. Defaults to None. @type angles: Optional[np.ndarray] @param labels: Labels of detected objects of shape (N,). Defaults to None. @type labels: Optional[np.ndarray] @param label_names: Names of the labels (classes). @type label_names: Optional[List[str]] @param keypoints: Keypoints of detected objects of shape (N, n_keypoints, dim) where     dim is 2 or 3. Defaults to None. @type keypoints: Optional[np.ndarray] @param keypoints_scores: Confidence scores of detected keypoints of shape (N,     n_keypoints). Defaults to None. @type keypoints_scores: Optional[np.ndarray] @param keypoint_label_names: Labels of keypoints. Defaults to None. @type keypoint_label_names: Optional[List[str]] @param keypoint_edges: Connection pairs of keypoints. Defaults to None. Example:     [(0,1), (1,2), (2,3), (3,0)] shows that keypoint 0 is connected to keypoint 1,     keypoint 1 is connected to keypoint 2, etc. @type keypoint_edges: Optional[List[Tuple[int, int]]] @param masks: Masks of detected objects of shape (N, img_height, img_width) or     (img_height, img_width, N). Defaults to None. @type masks: Optional[np.ndarray] @return: Message containing the bounding boxes, labels, confidence scores, and     keypoints of detected objects. @rtype: dai.ImgDetections @raise ValueError: If the bboxes are not a numpy array. @raise ValueError: If the bboxes are not of shape (N,4). @raise ValueError: If the scores are not a numpy array. @raise ValueError: If the scores are not of shape (N,). @raise ValueError: If the scores do not have the same length as bboxes. @raise ValueError: If the angles do not have the same length as bboxes. @raise ValueError: If the angles are not between -360 and 360. @raise ValueError: If the labels are not a list of integers. @raise ValueError: If the labels do not have the same length as bboxes. @raise ValueError: If the keypoints are not a numpy array of shape (N, M, 2 or 3). @raise ValueError: If the masks are not a 3D numpy array of shape (img_height,     img_width, N) or (N, img_height, img_width). @raise ValueError: If the keypoints scores are not a numpy array. @raise ValueError: If the keypoints scores are not of shape [n_detections,     n_keypoints, 1]. @raise ValueError: If the keypoints scores do not have the same length as keypoints. @raise ValueError: If the keypoints scores are not between 0 and 1.
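A sketch with a single detection (values invented; optional arguments omitted):

    import numpy as np
    from depthai_nodes.message.creators import create_detection_message

    bboxes = np.array([[0.5, 0.5, 0.2, 0.3]], dtype=np.float32)  # (x_center, y_center, w, h)
    scores = np.array([0.9], dtype=np.float32)
    labels = np.array([0])
    msg = create_detection_message(bboxes=bboxes, scores=scores, labels=labels,
                                   label_names=["person"])
    print(len(msg.detections))  # 1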
function
create_image_message(image: np.ndarray, is_bgr: bool = True, img_frame_type: dai.ImgFrame.Type = dai.ImgFrame.Type.BGR888i) -> dai.ImgFrame
Create a DepthAI message for an image array.  @param image: Image array in HWC or CHW format. @type image: np.array @param is_bgr: If True, the image is in BGR format. If False, the image is in RGB     format. Defaults to True. @type is_bgr: bool @param img_frame_type: Output ImgFrame type. Defaults to BGR888i. @type img_frame_type: dai.ImgFrame.Type @return: dai.ImgFrame object containing the image information. @rtype: dai.ImgFrame @raise ValueError: If the image shape is not CHW or HWC.
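A sketch wrapping a random HWC BGR frame:

    import numpy as np
    from depthai_nodes.message.creators import create_image_message

    frame = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)  # HWC, BGR
    img = create_image_message(frame, is_bgr=True)
    print(img.getWidth(), img.getHeight())  # 640 480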
function
create_line_detection_message(lines: np.ndarray, scores: np.ndarray)
Create a DepthAI message for a line detection.  @param lines: Detected lines of shape (N,4) meaning [...,[x_start, y_start, x_end, y_end],...]. @type lines: np.ndarray @param scores: Confidence scores of detected lines of shape (N,). @type scores: np.ndarray  @return: Message containing the lines and confidence scores of detected lines. @rtype: Lines  @raise ValueError: If the lines are not a numpy array. @raise ValueError: If the lines are not of shape (N,4). @raise ValueError: If the lines 2nd dimension is not of size 4. @raise ValueError: If the scores are not a numpy array. @raise ValueError: If the scores are not of shape (N,). @raise ValueError: If the scores do not have the same length as lines.
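A sketch (one diagonal line, values invented):

    import numpy as np
    from depthai_nodes.message.creators import create_line_detection_message

    lines = np.array([[0.1, 0.1, 0.9, 0.9]], dtype=np.float32)  # [x_start, y_start, x_end, y_end]
    scores = np.array([0.8], dtype=np.float32)
    msg = create_line_detection_message(lines, scores)
    print(msg.lines[0].confidence)  # ~0.8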
function
create_map_message(map: np.ndarray, min_max_scaling: bool = False) -> Map2D
Create a DepthAI message for a map of floats.  @param map: A NumPy array representing the map with shape HW or NHW/HWN, where N     stands for the batch dimension. @type map: np.array @param min_max_scaling: If True, the map is scaled to the range [0, 1]. Defaults to     False. @type min_max_scaling: bool @return: A Map2D object containing the density information. @rtype: Map2D @raise ValueError: If the density map is not a NumPy array. @raise ValueError: If the density map is not 2D or 3D. @raise ValueError: If the 3D density map shape is not NHW or HWN.
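A sketch with a random HW map:

    import numpy as np
    from depthai_nodes.message.creators import create_map_message

    density = np.random.rand(240, 320).astype(np.float32)  # HW
    msg = create_map_message(density, min_max_scaling=True)  # rescaled to [0, 1]
    print(msg.width, msg.height)  # 320 240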
function
create_regression_message(predictions: List [ float ]) -> Predictions
Create a DepthAI message for prediction models.  @param predictions: Predicted value(s). @type predictions: List[float] @return: Predictions message containing the predicted value(s). @rtype: Predictions @raise ValueError: If predictions is not a list. @raise ValueError: If each prediction is not a float.
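A sketch (a single made-up regression value, e.g. an age estimate):

    from depthai_nodes.message.creators import create_regression_message

    msg = create_regression_message([23.4])
    print(msg.prediction)  # 23.4 (first/only prediction)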
function
create_segmentation_message(mask: np.ndarray) -> SegmentationMask
Create a DepthAI message for a segmentation mask.  @param mask: Segmentation map array of shape (H, W) where each value represents a     segmented object class. @type mask: np.array @return: Segmentation mask message. @rtype: SegmentationMask @raise ValueError: If mask is not a numpy array. @raise ValueError: If mask is not 2D. @raise ValueError: If mask is not of type int16.
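A sketch (a toy mask with one labeled region):

    import numpy as np
    from depthai_nodes.message.creators import create_segmentation_message

    mask = np.full((240, 320), -1, dtype=np.int16)  # -1 marks unassigned pixels
    mask[100:150, 100:200] = 0                      # class 0 region
    msg = create_segmentation_message(mask)
    print(msg.mask.shape)  # (240, 320)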
function
create_tracked_features_message(reference_points: np.ndarray, target_points: np.ndarray) -> dai.TrackedFeatures
Create a DepthAI message for tracked features.  @param reference_points: Reference points of shape (N,2) meaning [...,[x, y],...]. @type reference_points: np.ndarray @param target_points: Target points of shape (N,2) meaning [...,[x, y],...]. @type target_points: np.ndarray  @return: Message containing the tracked features. @rtype: dai.TrackedFeatures  @raise ValueError: If the reference_points are not a numpy array. @raise ValueError: If the reference_points are not of shape (N,2). @raise ValueError: If the reference_points 2nd dimension is not of size 2. @raise ValueError: If the target_points are not a numpy array. @raise ValueError: If the target_points are not of shape (N,2). @raise ValueError: If the target_points 2nd dimension is not of size 2.
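A sketch with two matched point pairs (values invented):

    import numpy as np
    from depthai_nodes.message.creators import create_tracked_features_message

    reference_points = np.array([[0.20, 0.30], [0.60, 0.70]], dtype=np.float32)
    target_points = np.array([[0.25, 0.32], [0.61, 0.72]], dtype=np.float32)
    msg = create_tracked_features_message(reference_points, target_points)
    print(len(msg.trackedFeatures))  # reference and target features, matched by ID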
module

depthai_nodes.message.creators.detection

module

depthai_nodes.message.creators.tracked_features

function
create_feature_point(x: float, y: float, id: int, age: int) -> dai.TrackedFeature
Create a tracked feature point.  @param x: X coordinate of the feature point. @type x: float @param y: Y coordinate of the feature point. @type y: float @param id: ID of the feature point. @type id: int @param age: Age of the feature point. @type age: int @return: Tracked feature point. @rtype: dai.TrackedFeature
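A tiny sketch (values invented):

    from depthai_nodes.message.creators.tracked_features import create_feature_point

    feat = create_feature_point(x=0.5, y=0.5, id=1, age=0)
    print(feat.position.x, feat.position.y)  # 0.5 0.5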
module

depthai_nodes.message.gathered_data

type variable
TGathered
type variable
TReference
package

depthai_nodes.message.utils

function
copy_message(msg: dai.Buffer) -> dai.Buffer
Copies the incoming message and returns it.  @param msg: The input message. @type msg: dai.Buffer @return: The copied message. @rtype: dai.Buffer
function
compute_area(detection: Union [ dai.ImgDetection , dai.SpatialImgDetection ])
Computes the normalized area of a detection bounding box.  @param detection: Detection object to compute the area for. Can be of type     dai.ImgDetection or dai.SpatialImgDetection. @type detection: Union[dai.ImgDetection, dai.SpatialImgDetection] @return: Normalized area (width * height) of the detection bounding box. @rtype: float
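A sketch (the import path is assumed from the section above; fields are set by hand):

    import depthai as dai
    from depthai_nodes.message.utils import compute_area  # import path assumed

    det = dai.ImgDetection()
    det.xmin, det.ymin, det.xmax, det.ymax = 0.1, 0.1, 0.5, 0.4
    print(compute_area(det))  # 0.4 * 0.3 = 0.12 (normalized area)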
class

depthai_nodes.message.Classifications(depthai.Buffer)

method
__init__(self)
Initializes the Classifications object.
method
copy(self)
Creates a new instance of the Classifications class and copies the attributes.  @return: A new instance of the Classifications class. @rtype: Classifications
property
classes
Returns the list of classes.  @return: List of classes. @rtype: List[str]
method
classes.setter(self, value: List [ str ])
Sets the classes.  @param value: A list of class names. @type value: List[str] @raise TypeError: If value is not a list. @raise ValueError: If each element is not of type string.
property
scores
Returns the list of scores.  @return: List of scores. @rtype: NDArray[np.float32]
method
scores.setter(self, value: NDArray [ np.float32 ])
Sets the scores.  @param value: A list of scores. @type value: NDArray[np.float32] @raise TypeError: If value is not a numpy array. @raise ValueError: If value is not a 1D numpy array. @raise ValueError: If each element is not of type float.
property
top_class
Returns the most probable class. Only works if classes are sorted by scores.  @return: The top class. @rtype: str
property
top_score
Returns the probability of the most probable class. Only works if scores are sorted in descending order.  @return: The top score. @rtype: float
variable
transformation
Returns the Image Transformation object.  @return: The Image Transformation object. @rtype: dai.ImgTransformation
method
transformation.setter(self, value: Optional [ dai.ImgTransformation ])
Sets the Image Transformation object.  @param value: The Image Transformation object. @type value: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
setTransformation(self, transformation: Optional [ dai.ImgTransformation ])
Sets the Image Transformation object.  @param transformation: The Image Transformation object. @type transformation: dai.ImgTransformation @raise TypeError: If transformation is not a dai.ImgTransformation object.
method
getTransformation(self) -> Optional[dai.ImgTransformation]
Returns the Image Transformation object.  @return: The Image Transformation object. @rtype: dai.ImgTransformation
method
getVisualizationMessage(self) -> dai.ImgAnnotations
Returns default visualization message for classification.  The message adds the top five classes and their scores to the right side of the image.
class

depthai_nodes.message.Cluster(depthai.Buffer)

method
__init__(self)
Initializes the Cluster object.
method
copy(self)
Creates a new instance of the Cluster class and copies the attributes.  @return: A new instance of the Cluster class. @rtype: Cluster
property
label
Returns the label of the cluster.  @return: Label of the cluster. @rtype: int
method
label.setter(self, value: int)
Sets the label of the cluster.  @param value: Label of the cluster. @type value: int @raise TypeError: If value is not an int.
property
points
Returns the points in the cluster.  @return: List of points in the cluster. @rtype: List[dai.Point2f]
method
points.setter(self, value: List [ dai.Point2f ])
Sets the points in the cluster.  @param value: List of points in the cluster. @type value: List[dai.Point2f] @raise TypeError: If value is not a list. @raise TypeError: If each element is not of type dai.Point2f.
class

depthai_nodes.message.Clusters(depthai.Buffer)

method
__init__(self)
Initializes the Clusters object.
method
copy(self)
Creates a new instance of the Clusters class and copies the attributes.  @return: A new instance of the Clusters class. @rtype: Clusters
property
clusters
Returns the clusters.  @return: List of clusters. @rtype: List[Cluster]
method
clusters.setter(self, value: List [ Cluster ])
Sets the clusters.  @param value: List of clusters. @type value: List[Cluster] @raise TypeError: If value is not a list. @raise ValueError: If each element is not of type Cluster.
variable
transformation
Returns the Image Transformation object.  @return: The Image Transformation object. @rtype: dai.ImgTransformation
method
transformation.setter(self, value: Optional [ dai.ImgTransformation ])
Sets the Image Transformation object.  @param value: The Image Transformation object. @type value: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
setTransformation(self, transformation: Optional [ dai.ImgTransformation ])
Sets the Image Transformation object.  @param transformation: The Image Transformation object. @type transformation: dai.ImgTransformation @raise TypeError: If transformation is not a dai.ImgTransformation object.
method
getTransformation(self) -> Optional[dai.ImgTransformation]
Returns the Image Transformation object.  @return: The Image Transformation object. @rtype: dai.ImgTransformation
method
getVisualizationMessage(self) -> dai.ImgAnnotations
Creates a default visualization message for clusters and colors each one separately.
class

depthai_nodes.message.Collection(depthai.Buffer, typing.Generic)

class

depthai_nodes.message.GatheredData(depthai_nodes.message.Collection, typing.Generic)

property
reference_data
Returns the reference data.  @return: Reference data. @rtype: TReference
method
reference_data.setter(self, value: TReference)
Sets the reference data.  @param value: Reference data. @type value: TReference
class

depthai_nodes.message.Line(depthai.Buffer)

method
__init__(self)
Initializes the Line object.
method
copy(self)
Creates a new instance of the Line class and copies the attributes.  @return: A new instance of the Line class. @rtype: Line
property
start_point
Returns the start point of the line.  @return: Start point of the line. @rtype: dai.Point2f
method
start_point.setter(self, value: dai.Point2f)
Sets the start point of the line.  @param value: Start point of the line. @type value: dai.Point2f @raise TypeError: If value is not of type dai.Point2f.
property
end_point
Returns the end point of the line.  @return: End point of the line. @rtype: dai.Point2f
method
end_point.setter(self, value: dai.Point2f)
Sets the end point of the line.  @param value: End point of the line. @type value: dai.Point2f @raise TypeError: If value is not of type dai.Point2f.
property
confidence
Returns the confidence of the line.  @return: Confidence of the line. @rtype: float
method
confidence.setter(self, value: float)
Sets the confidence of the line.  @param value: Confidence of the line. @type value: float @raise TypeError: If value is not a float. @raise ValueError: If value is not between 0 and 1.
class

depthai_nodes.message.Lines(depthai.Buffer)

method
__init__(self)
Initializes the Lines object.
method
copy(self)
Creates a new instance of the Lines class and copies the attributes.  @return: A new instance of the Lines class. @rtype: Lines
property
lines
Returns the lines.  @return: List of lines. @rtype: List[Line]
method
lines.setter(self, value: List [ Line ])
Sets the lines.  @param value: List of lines. @type value: List[Line] @raise TypeError: If value is not a list. @raise TypeError: If each element is not of type Line.
variable
transformation
Returns the Image Transformation object.  @return: The Image Transformation object. @rtype: dai.ImgTransformation
method
transformation.setter(self, value: Optional [ dai.ImgTransformation ])
Sets the Image Transformation object.  @param value: The Image Transformation object. @type value: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
setTransformation(self, transformation: Optional [ dai.ImgTransformation ])
Sets the Image Transformation object.  @param transformation: The Image Transformation object. @type transformation: dai.ImgTransformation @raise TypeError: If transformation is not a dai.ImgTransformation object.
method
getTransformation(self) -> Optional[dai.ImgTransformation]
Returns the Image Transformation object.  @return: The Image Transformation object. @rtype: dai.ImgTransformation
method
getVisualizationMessage(self) -> dai.ImgAnnotations
Returns default visualization message for lines.  The message adds lines to the image.
class

depthai_nodes.message.Map2D(depthai.Buffer)

method
__init__(self)
Initializes the Map2D object.
method
copy(self)
Creates a new instance of the Map2D class and copies the attributes.  @return: A new instance of the Map2D class. @rtype: Map2D
property
map
Returns the 2D map.  @return: 2D map. @rtype: NDArray[np.float32]
method
map.setter(self, value: np.ndarray)
Sets the 2D map.  @param value: 2D map. @type value: NDArray[np.float32] @raise TypeError: If value is not a numpy array. @raise ValueError: If value is not a 2D numpy array. @raise ValueError: If each element is not of type float.
property
width
Returns the 2D map width.  @return: 2D map width. @rtype: int
property
height
Returns the 2D map height.  @return: 2D map height. @rtype: int
variable
transformation
Returns the Image Transformation object.  @return: The Image Transformation object. @rtype: dai.ImgTransformation
method
transformation.setter(self, value: Optional [ dai.ImgTransformation ])
Sets the Image Transformation object.  @param value: The Image Transformation object. @type value: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
setTransformation(self, transformation: dai.ImgTransformation)
Sets the Image Transformation object.  @param transformation: The Image Transformation object. @type transformation: dai.ImgTransformation @raise TypeError: If transformation is not a dai.ImgTransformation object.
method
getTransformation(self) -> Optional[dai.ImgTransformation]
Returns the Image Transformation object.  @return: The Image Transformation object. @rtype: dai.ImgTransformation
method
getVisualizationMessage(self) -> dai.ImgFrame
Returns default visualization message for 2D maps in the form of a colormapped image.
class

depthai_nodes.message.Prediction(depthai.Buffer)

method
__init__(self)
Initializes the Prediction object.
method
copy(self)
Creates a new instance of the Prediction class and copies the attributes.  @return: A new instance of the Prediction class. @rtype: Prediction
property
prediction
Returns the prediction.  @return: The predicted value. @rtype: float
method
prediction.setter(self, value: float)
Sets the prediction.  @param value: The predicted value. @type value: float @raise TypeError: If value is not of type float.
class

depthai_nodes.message.Predictions(depthai.Buffer)

method
__init__(self)
Initializes the Predictions object.
method
copy(self)
Creates a new instance of the Predictions class and copies the attributes.  @return: A new instance of the Predictions class. @rtype: Predictions
property
predictions
Returns the predictions.  @return: List of predictions. @rtype: List[Prediction]
method
predictions.setter(self, value: List [ Prediction ])
Sets the predictions.  @param value: List of predicted values. @type value: List[Prediction] @raise TypeError: If value is not a list. @raise ValueError: If each element is not of type Prediction.
property
prediction
Returns the first prediction. Useful for single predictions.  @return: The predicted value. @rtype: float
variable
transformation
Returns the Image Transformation object.  @return: The Image Transformation object. @rtype: dai.ImgTransformation
method
transformation.setter(self, value: Optional [ dai.ImgTransformation ])
Sets the Image Transformation object.  @param value: The Image Transformation object. @type value: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
setTransformation(self, transformation: Optional [ dai.ImgTransformation ])
Sets the Image Transformation object.  @param transformation: The Image Transformation object. @type transformation: dai.ImgTransformation @raise TypeError: If transformation is not a dai.ImgTransformation object.
method
getTransformation(self) -> Optional[dai.ImgTransformation]
Returns the Image Transformation object.  @return: The Image Transformation object. @rtype: dai.ImgTransformation
method
getVisualizationMessage(self) -> dai.ImgAnnotations
Returns the visualization message for the predictions.  The message adds text representing the predictions to the right of the image.
class

depthai_nodes.message.SegmentationMask(depthai.Buffer)

method
__init__(self)
Initializes the SegmentationMask object.
method
copy(self)
Creates a new instance of the SegmentationMask class and copies the attributes.  @return: A new instance of the SegmentationMask class. @rtype: SegmentationMask
property
mask
Returns the segmentation mask.  @return: Segmentation mask. @rtype: NDArray[np.int16]
method
mask.setter(self, value: NDArray [ np.int16 ])
Sets the segmentation mask.  @param value: Segmentation mask. @type value: NDArray[np.int16] @raise TypeError: If value is not a numpy array. @raise ValueError: If value is not a 2D numpy array. @raise ValueError: If each element is not of type int16. @raise ValueError: If any element is smaller than -1.
variable
transformation
Returns the Image Transformation object.  @return: The Image Transformation object. @rtype: dai.ImgTransformation
method
transformation.setter(self, value: Optional [ dai.ImgTransformation ])
Sets the Image Transformation object.  @param value: The Image Transformation object. @type value: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
setTransformation(self, transformation: Optional [ dai.ImgTransformation ])
Sets the Image Transformation object.  @param transformation: The Image Transformation object. @type transformation: dai.ImgTransformation @raise TypeError: If transformation is not a dai.ImgTransformation object.
method
getTransformation(self) -> Optional[dai.ImgTransformation]
Returns the Image Transformation object.  @return: The Image Transformation object. @rtype: dai.ImgTransformation
method
getVisualizationMessage(self) -> dai.ImgFrame
Returns the default visualization message for segmentation masks.
package

depthai_nodes.node

class
ApplyColormap
A host node that applies a colormap to a 2D array (e.g. depth maps, segmentation masks, heatmaps, etc.).  This node is generic and uses per-frame max-value normalization. For depth visualization prefer 'ApplyDepthColormap' to avoid flicker caused by the changing normalization range.  Parameters ---------- colormapValue : Union[int, np.ndarray], optional     OpenCV colormap enum (e.g. cv2.COLORMAP_JET) or a custom OpenCV-compatible     colormap LUT. Default is cv2.COLORMAP_JET. maxValue : int, optional     Maximum value used for normalization. If set to 0, the maximum value     is determined per-frame. Default is 0.  Inputs ------ frame : dai.ImgFrame | Map2D | dai.ImgDetections | SegmentationMask     Input message containing a 2D array to be colorized.  Outputs ------- output : dai.ImgFrame     Colorized output frame (3-channel BGR).
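A pipeline sketch modeled on the DepthMerger usage shown later in this section; the build() arguments and output attribute name are assumptions, not confirmed by this reference:

    from depthai_nodes.node import ApplyColormap

    # `pipeline` is an existing dai.Pipeline; `source` is any output producing
    # one of the input message types listed above (e.g. a SegmentationMask stream).
    colorizer = pipeline.create(ApplyColormap).build(source)  # build() args assumed
    frames_q = colorizer.out.createOutputQueue()  # output attribute name assumed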
class
ApplyDepthColormap
A host node that applies a colormap to a depth map using percentile-based normalization to reduce flicker.  Works with RAW 2D dai.ImgFrame outputs such as stereo.depth and stereo.disparity frames. Percentile normalization is typically more beneficial for stereo.depth since disparity often has a fixed output range.  Invalid depth values (<= 0) are ignored when computing percentiles and are rendered as black in the output.  Parameters ---------- colormapValue : Union[int, np.ndarray], optional     OpenCV colormap enum (e.g. cv2.COLORMAP_JET) or a custom OpenCV-compatible     colormap LUT. Default is cv2.COLORMAP_JET. pLow : float, optional     Lower normalization percentile in [0, 100). Default 2.0. pHigh : float, optional     Upper normalization percentile in (0, 100]. Default 98.0.  Inputs ------ frame : dai.ImgFrame     Input message containing a 2D array to be colorized.  Outputs ------- output : dai.ImgFrame     Colorized output frame (3-channel BGR).
class
BaseHostNode
An abstract base class for host nodes.  Designed to encapsulate and abstract the configuration of platform-specific attributes, providing a clean and consistent interface for derived classes.
class
BaseThreadedHostNode
An abstract base class for threaded host nodes.  Designed to encapsulate and abstract the configuration of platform-specific attributes, providing a clean and consistent interface for derived classes.
class
CoordinatesMapper
Threaded host node that remaps message coordinates into a cached reference frame.  This is a temporary node; its functionality will eventually be added to the DepthAI ImageAlign node.  The node takes two inputs: - a **target transformation** stream used to establish and update the cached   reference frame, - a message stream whose coordinates should be remapped.  Any DepthAI message that provides a ``getTransformation()`` and a ``setTransformation()`` method can be remapped. Internally, coordinate fields are transformed from the message’s original reference frame into the target reference frame.  On-device, a lightweight Script node extracts only the :class:`dai.ImgTransformation` from incoming messages and forwards it to the host. This avoids transferring large image payloads and reduces host–device bandwidth usage.  The first target transformation message is required before any source messages can be remapped. After that, the node keeps using the cached transformation and updates it only when a newer target message is available via ``tryGet()``.  Message groups are handled recursively: each contained message is remapped individually while preserving timestamps and sequence numbers.  Notes ----- - Messages that do not support coordinate remapping are passed through   unchanged. - The output message always carries the target transformation as its   transformation. - This node is currently **not supported on RVC2**.  Inputs ------ toTransformationInput : dai.Node.Output     Output producing messages that define the target reference frame.     Only the transformation is extracted on-device. fromTransformationInput : dai.Node.Output     Output producing messages whose coordinates should be remapped.  Outputs ------- out : dai.Node.Output     Messages with coordinates remapped into the target reference frame.  Raises ------ RuntimeError     If used on an unsupported platform (RVC2), or if the target     transformation cannot be obtained from the input message.
class
DepthMerger
DepthMerger is a custom host node for merging 2D detections with depth information to produce spatial detections.  Attributes ---------- output : dai.Node.Output     The output of the DepthMerger node containing spatial detections. shrinkingFactor : float     The percentage of the bounding box to shrink from each side before     sampling depth.  Usage ----- depth_merger = pipeline.create(DepthMerger).build(     output2d=nn.out,     outputDepth=stereo.depth )
class
ExtendedNeuralNetwork
A high-level host node that performs neural network inference with automatic input resizing and optional coordinate remapping.  `ExtendedNeuralNetwork` is a convenience wrapper around an internal :class:`ParsingNeuralNetwork` node. It handles:  - Model loading from HubAI slug, :class:`dai.NNModelDescription`,   or :class:`dai.NNArchive`. - Automatic input resizing to match the neural network input resolution. - Optional coordinate remapping when the input is not a camera node.  Two input modes are supported:  - **Camera input**: When `inputImage` is a :class:`dai.node.Camera`,   the node requests a resized output directly from the camera using   the appropriate hardware resize mode. In this case, the neural   network outputs are already aligned with the original image   coordinates and no additional mapping is required.  - **Generic stream input**: When `inputImage` is a   :class:`dai.Node.Output`, an internal :class:`dai.node.ImageManip`   node resizes frames to the network's expected input size. A   :class:`CoordinatesMapper` node is then inserted to map neural   network outputs back to the original image coordinate space.  The node exposes neural network outputs via :attr:`out`, and passthrough frames via :attr:`passthrough`.  Notes ----- - This node is currently not supported on the RVC2 platform. - When a non-camera input is used, an additional ImageManip node   is inserted into the pipeline. - Coordinate remapping is performed automatically when resizing   occurs outside of a camera node.  Outputs ------- out : dai.Node.Output     Parsed neural network output stream. If coordinate remapping     is required, this stream contains remapped results. outputs : dai.Node.Output     Alias for :attr:`out` or the raw neural network outputs,     depending on input mode. passthrough : dai.Node.Output     Passthrough stream from the underlying neural network node.  See Also -------- ParsingNeuralNetwork     Node responsible for running inference and parsing results. CoordinatesMapper     Node used to remap output coordinates when resizing is applied. dai.node.ImageManip     Node used for resizing when input is not a camera node.
class
FrameCropper
A host node that crops detection regions from frames and outputs one cropped :class:`dai.ImgFrame` per region.  `FrameCropper` is a convenience wrapper around an internal :class:`dai.node.ImageManip` configured for cropping + resizing. It supports two input modes:  - **fromImgDetections**: Provide :class:`dai.ImgDetections`   and the node will generate :class:`dai.ImageManipConfig` messages for each   detection via a :class:`dai.node.Script` node. Each config is paired with   the corresponding input frame, producing one cropped output frame per   detection. - **fromManipConfigs**: Provide an upstream stream of cropping configs packed   in :class:`dai.MessageGroup` messages. An on-device :class:`dai.node.Script`   node pairs each config with the current frame and forwards them to the   internal :class:`dai.node.ImageManip`.  Configuration is provided via :meth:`fromImgDetections` or :meth:`fromManipConfigs`. The pipeline nodes are constructed only once :meth:`build` is called.  Notes ----- - Exactly one configuration path must be selected: only one of   :meth:`fromImgDetections` and :meth:`fromManipConfigs` can be used. - Output frames are always resized to `outputSize` using the provided   `resizeMode` (default: ``CENTER_CROP``). - In `fromImgDetections` mode, a :class:`dai.node.Script` node drives the   cropping by emitting one :class:`dai.ImageManipConfig` per detection. - In `fromManipConfigs` mode, the `inputManipConfigs` stream **must**   output :class:`dai.MessageGroup` messages where each value is a   :class:`dai.ImageManipConfig`. Key naming is arbitrary.  Parameters ---------- fromImgDetections(padding=0.0)     Optional padding factor applied around each detection region. build(outputSize, resizeMode)     Sets the crop output size and the resize mode used by ImageManip.  Outputs ------- out : dai.Node.Output     Stream of cropped :class:`dai.ImgFrame` messages. One output frame is     produced per crop configuration (per detection in `fromImgDetections`     mode; per config in the received `MessageGroup` in `fromManipConfigs` mode).  See Also -------- dai.node.ImageManip     Node used to perform cropping and resizing. dai.ImageManipConfig     Cropping configuration messages forwarded to ImageManip. dai.ImgDetections     Detection message type used in `fromImgDetections` mode. dai.MessageGroup     Message type expected by `fromManipConfigs`.
class
GatherData
Threaded host node that groups (“gathers”) multiple data messages around a single reference message, matched by timestamp.  The node receives two input streams:  - **reference_input**: reference messages (e.g., detections) that define a   grouping key (timestamp) and determine how many data items should be   gathered for that reference. - **data_input**: messages to be collected for the nearest reference timestamp   within a tolerance derived from the camera FPS.  For each reference timestamp, the node waits until the number of gathered data messages equals `wait_count_fn(reference)`. Once ready, it emits a :class:`depthai_nodes.GatheredData` message containing the reference message   and the gathered items.  The default `wait_count_fn` uses ``len(reference.detections)``, which works out-of-the-box for messages that expose a ``detections`` attribute (e.g. ``dai.ImgDetections``).  Notes ----- - Timestamp matching uses ``Buffer.getTimestamp().total_seconds()`` and a   tolerance of ``1 / (camera_fps * FPS_TOLERANCE_DIVISOR)``. - If ``wait_count_fn(reference) == 0``, the node emits immediately for that   reference (with an empty items list). - The node periodically polls inputs using ``tryGet()`` at a rate derived   from ``camera_fps`` and ``INPUT_CHECKS_PER_FPS``.  Inputs ------ _data_input : dai.Node.Input     Stream of data messages to be gathered (type ``TGathered``). _reference_input : dai.Node.Input     Stream of reference messages used for grouping and deciding how many     items to gather (type ``TReference``).  Outputs ------- out : dai.Node.Output     Emits :class:`depthai_nodes.GatheredData` objects with:     ``reference_data`` (the matched reference) and ``items`` (list of data).  Class Attributes --------------- FPS_TOLERANCE_DIVISOR : float     Divides the per-frame time interval to compute timestamp matching tolerance.     Higher values make matching stricter. INPUT_CHECKS_PER_FPS : int     Number of polling iterations per frame interval. Effective loop sleep is     ``1 / (INPUT_CHECKS_PER_FPS * camera_fps)``.
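A hedged wiring sketch for the common two-stage (detection + recognition) pattern; the build() signature is an assumption, and the input names are taken verbatim from the list above:

    from depthai_nodes.node import GatherData

    # `pipeline`, `detection_nn`, and `recognition_nn` are pre-existing nodes.
    gather = pipeline.create(GatherData).build(camera_fps=30)  # build() args assumed
    detection_nn.out.link(gather._reference_input)  # input names as listed above
    recognition_nn.out.link(gather._data_input)
    q = gather.out.createOutputQueue()  # yields GatheredData messages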
class
HostSpatialsCalc
HostSpatialsCalc is a helper class for calculating spatial coordinates from depth data.  Attributes ---------- calibData : dai.CalibrationHandler     Calibration data handler for the device. depthAlignmentSocket : dai.CameraBoardSocket     The camera socket used for depth alignment. delta : int     The delta value for ROI calculation. Default is 5, meaning a 10x10 depth-pixel region around the point is used for depth averaging. threshLow : int     The lower threshold for depth values. Default is 200 (i.e. 20 cm). threshHigh : int     The upper threshold for depth values. Default is 30000 (i.e. 30 m).
class
ImgDetectionsFilter
Filters out detections based on the specified criteria and outputs them as a separate message. The order of operations:     1. Filter by label/confidence/area;     2. Sort (if applicable);     3. Subset.  Attributes ---------- keepLabels(labels)     Keep only detections whose label is present in ``labels``. rejectLabels(labels)     Drop detections whose label is present in ``labels``. minConfidence(threshold)     Require detections to meet a minimum confidence. minArea(area)     Require detections to meet a minimum normalized bounding-box area. useNms(confThresh=..., iouThresh=...)     Enable non-maximum suppression after filtering. sortByConfidence(desc=True)     Sort detections by confidence before optional top-k truncation. takeFirstK(k)     Keep only the first ``k`` detections after all previous steps.
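A sketch chaining the methods listed above in the documented order of operations; only the build() wiring and output attribute name are assumptions:

    from depthai_nodes.node import ImgDetectionsFilter

    det_filter = pipeline.create(ImgDetectionsFilter).build(nn.out)  # build() args assumed
    det_filter.minConfidence(0.5)  # 1. filter by confidence
    det_filter.sortByConfidence()  # 2. sort (descending)
    det_filter.takeFirstK(5)       # 3. keep only the first 5
    filtered_q = det_filter.out.createOutputQueue()  # output attribute name assumed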
class
ImgFrameOverlay
A host node that receives two dai.ImgFrame objects and overlays them into a single one.  Attributes ---------- frame1 : dai.ImgFrame     The input message for the background frame. frame2 : dai.ImgFrame     The input message for the foreground frame. alpha : float     The weight of the background frame in the overlay. By default, the weight is 0.5, which means that both frames are represented equally in the overlay. preserveBackground : bool     If True, zero areas in the foreground frame are ignored in the output overlay frame. Default is False. out : dai.ImgFrame     The output message for the overlay frame.
class
InstanceToSemanticMask
Converts a dai.ImgDetections instance mask into a semantic mask by mapping unique instance IDs to detection class labels.  Attributes ---------- detections: dai.ImgDetections     Input detections with instance segmentation masks. out: dai.ImgDetections     Output detections with semantic segmentation masks.
class
MessageCollector
Threaded host node that collects multiple data messages from a single input matched by timestamp.  The node receives one input stream:  - **data_input**: messages to be collected.  For each reference timestamp, the node waits until the number of gathered data messages equals `wait_count_fn(reference)`. Once ready, it emits a :class:`depthai_nodes.Collection` message containing the gathered items.  The default `wait_count_fn` uses ``len(reference.detections)``, which works out-of-the-box for messages that expose a ``detections`` attribute (e.g. ``dai.ImgDetections``).  Inputs ------ _data_input : dai.Node.Input     Stream of data messages to be gathered (type ``TGathered``).  Outputs ------- out : dai.Node.Output     Emits :class:`depthai_nodes.Collection` objects with:     ``items`` (list of data).
class
ParserGenerator
General interface for instantiating parsers based on the provided model archive.  The `build` method creates parsers based on the head information stored in the NN Archive. The method then returns a dictionary of these parsers.
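A sketch following the description above; the exact build() arguments and the dictionary key type are assumptions:

    import depthai as dai
    from depthai_nodes.node import ParserGenerator

    archive = dai.NNArchive("path/to/model.tar.xz")  # hypothetical archive path
    parsers = pipeline.create(ParserGenerator).build(archive)  # returns a dict of parsers
    # `parsers` maps model heads to parser nodes (key type assumed).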
class
BaseParser
Base class for neural network output parsers. This class serves as a foundation for specific parser implementations used to postprocess the outputs of neural network models. Each parser is attached to a model "head" that governs the parsing process, as it contains all the necessary information for the parser to function correctly. Subclasses should implement the `build` method to correctly set all parameters of the parser and the `run` method to define the parsing logic.  Attributes ---------- input : Node.Input     Node's input. It is a linking point to which the Neural Network's output is linked. It accepts the output of the Neural Network node. out : Node.Output     Parser sends the processed network results to this output in the form of a DepthAI message. It is a linking point from which the processed network results are retrieved.
class
ClassificationParser
Postprocessing logic for Classification model.  Attributes ---------- output_layer_name: str     Name of the output layer relevant to the parser. classes : List[str]     List of class names to be used for linking with their respective scores.     Expected to be in the same order as the Neural Network's output. If not provided, the message will only return sorted scores. is_softmax : bool = True     If False, the scores are converted to probabilities using the softmax function.  Output Message/s ---------------- **Type** : Classifications(dai.Buffer)  **Description**: An object with attributes `classes` and `scores`. `classes` is a list of classes, sorted in descending order of scores. `scores` is a list of corresponding scores.
class
ClassificationSequenceParser
Postprocessing logic for a classification sequence model. The model predicts the classes multiple times and returns a list of predicted classes, where each item corresponds to the relative step in the sequence. In addition to time series classification, this parser can also be used for text recognition models where words can be interpreted as a sequence of characters (classes).  Attributes ---------- output_layer_name: str     Name of the output layer relevant to the parser. classes: List[str]     List of available classes for the model. is_softmax: bool     If False, the scores are converted to probabilities using the softmax function. ignored_indexes: List[int]     List of indexes to ignore during classification generation (e.g., background class, blank space). remove_duplicates: bool     If True, removes consecutive duplicates from the sequence. concatenate_classes: bool     If True, concatenates consecutive words based on the predicted spaces.  Output Message/s ---------------- **Type**: Classifications(dai.Buffer)  **Description**:     An object with attributes `classes` and `scores`. `classes` is a list containing the predicted classes. `scores` is a list of corresponding probability scores.
class
DetectionParser
Parser class for parsing the output of a "general" detection model. The parser expects the output of the model to have two tensors: one for bounding boxes and one for scores. Tensor for bboxes should be of shape (N, 4) and scores should be of shape (N,). Bboxes are expected to be in the format [xmin, ymin, xmax, ymax]. If this is not the case you can check other parsers or create a new one. As the result, the node sends out the detected objects in the form of a message containing bounding boxes and confidence scores.  Attributes ---------- output_layer_name: str     Name of the output layer relevant to the parser. conf_threshold : float     Confidence score threshold of detected bounding boxes. iou_threshold : float     Non-maximum suppression threshold. max_det : int     Maximum number of detections to keep. label_names : List[str]     List of label names for detected objects.  Output Message/s     -------     **Type**: dai.ImgDetections      **Description**: dai.ImgDetections message containing bounding boxes and confidence scores of detected objects. ----------------
class
EmbeddingsParser
Parser class for parsing the output of embeddings neural network model head.  Attributes ---------- output_layer_name: str     Name of the output layer relevant to the parser.  Output Message/s ---------------- **Type**: dai.NNData  **Description**: The output layer of the neural network model head.
class
FastSAMParser
Parser class for parsing the output of the FastSAM model.  Attributes ---------- conf_threshold : float     Confidence score threshold for detected objects. n_classes : int     Number of classes in the model. iou_threshold : float     Non-maximum suppression threshold. mask_conf : float     Mask confidence threshold. prompt : str     Prompt type. points : Tuple[int, int]     Points. point_label : int     Point label. bbox : Tuple[int, int, int, int]     Bounding box. yolo_outputs : List[str]     Names of the YOLO outputs. mask_outputs : List[str]     Names of the mask outputs. protos_output : str     Name of the protos output.  Output Message/s ---------------- **Type**: SegmentationMask  **Description**: SegmentationMask message containing the resulting segmentation masks given the prompt.
class
ImageOutputParser
Parser class for image-to-image models (e.g. DnCNN3, zero-dce, etc.) where the output is a modified image (denoised, enhanced, etc.).  Attributes ---------- output_layer_name: str     Name of the output layer relevant to the parser. output_is_bgr : bool     Flag indicating if the output image is in BGR (Blue-Green-Red) format.  Output Message/s ---------------- **Type**: dai.ImgFrame  **Description**: Image message containing the output image, e.g. denoised or enhanced images.  Error Handling -------------- **ValueError**: If the output is not 3- or 4-dimensional.  **ValueError**: If the number of output layers is not 1.
class
LaneDetectionParser
Parser class for the Ultra-Fast-Lane-Detection model. It expects one output layer containing the lane detection results. It supports two versions of the model: CuLane and TuSimple. Results are represented with clusters of points.  Attributes ---------- output_layer_name: str     Name of the output layer relevant to the parser. row_anchors : List[int]     List of row anchors. griding_num : int     Griding number. cls_num_per_lane : int     Number of points per lane. input_size : Tuple[int, int]     Input size (width, height).  Output Message/s ---------------- **Type**: Clusters **Description**: Detected lanes represented as clusters of points.  Error Handling -------------- **ValueError**: If the row anchors are not specified. **ValueError**: If the griding number is not specified. **ValueError**: If the number of points per lane is not specified.
class
MapOutputParser
A parser class for models that produce map outputs, such as depth maps (e.g. DepthAnything), density maps (e.g. DM-Count), heat maps, and similar.  Attributes ---------- output_layer_name: str     Name of the output layer relevant to the parser. min_max_scaling : bool     If True, the map is scaled to the range [0, 1].  Output Message/s ---------------- **Type**: Map2D  **Description**: Density message containing the density map. The density map is represented with Map2D object.
class
MPPalmDetectionParser
Parser class for parsing the output of the MediaPipe Palm detection model. As a result, the node sends out the detected hands in the form of a message containing bounding boxes, labels, and confidence scores.  Attributes ---------- output_layer_names: List[str]     Names of the output layers relevant to the parser. conf_threshold : float     Confidence score threshold for detected hands. iou_threshold : float     Non-maximum suppression threshold. max_det : int     Maximum number of detections to keep. scale : int     Scale of the input image.  Output Message/s ---------------- **Type**: dai.ImgDetections  **Description**: dai.ImgDetections message containing bounding boxes, labels, and confidence scores of detected hands.  See also -------- Official MediaPipe Hands solution: https://ai.google.dev/edge/mediapipe/solutions/vision/hand_landmarker
class
MLSDParser
Parser class for parsing the output of the M-LSD line detection model. As a result, the node sends out the detected lines in the form of a message.  Attributes ---------- output_layer_tpmap : str     Name of the output layer containing the tpMap tensor. output_layer_heat : str     Name of the output layer containing the heat tensor. topk_n : int     Number of top candidates to keep. score_thr : float     Confidence score threshold for detected lines. dist_thr : float     Distance threshold for merging lines.  Output Message/s ---------------- **Type**: Lines  **Description**: Lines message containing detected lines and confidence scores.
class
PPTextDetectionParser
Parser class for parsing the output of the PaddlePaddle OCR text detection model.  Attributes ---------- output_layer_name: str     Name of the output layer relevant to the parser. conf_threshold : float     The threshold for bounding boxes. mask_threshold : float     The threshold for the mask. max_det : int     The maximum number of candidate bounding boxes.  Output Message/s ------- **Type**: dai.ImgDetections **Description**: dai.ImgDetections message containing bounding boxes and the respective confidence scores of detected text.
class
RegressionParser
Parser class for parsing the output of a model with regression output (e.g. Age-Gender).  Attributes ---------- output_layer_name : str     Name of the output layer relevant to the parser.  Output Message/s ---------------- **Type**: Predictions  **Description**: Message containing the prediction(s).
class
SCRFDParser
Parser class for parsing the output of the SCRFD face detection model.  Attributes ---------- output_layer_name: List[str]     Names of the output layers relevant to the parser. conf_threshold : float     Confidence score threshold for detected faces. iou_threshold : float     Non-maximum suppression threshold. max_det : int     Maximum number of detections to keep. input_size : tuple     Input size of the model. feat_stride_fpn : tuple     Tuple of the feature strides. num_anchors : int     Number of anchors.  Output Message/s ---------------- **Type**: dai.ImgDetections  **Description**: dai.ImgDetections message containing bounding boxes, labels, and confidence scores of detected faces.
class
SegmentationParser
Parser class for parsing the output of segmentation models.  Attributes ---------- output_layer_name: str     Name of the output layer relevant to the parser. classes_in_one_layer : bool     Whether all classes are in one layer in the multi-class segmentation model. Default is False. If True, the parser will use np.max instead of np.argmax to get the class map.  Output Message/s ---------------- **Type**: SegmentationMask  **Description**: Segmentation message containing the segmentation mask. Every pixel belongs to exactly one class. Unassigned pixels are represented with "-1" and class pixels with non-negative integers.  Error Handling -------------- **ValueError**: If the number of output layers is not 1.  **ValueError**: If the number of dimensions of the output tensor is not 3.
class
XFeatMonoParser
Parser class for parsing the output of the XFeat model. It can be used for parsing the output from one source (e.g. one camera). The reference frame can be set with the trigger method.  Attributes ---------- output_layer_feats : str     Name of the output layer containing features. output_layer_keypoints : str     Name of the output layer containing keypoints. output_layer_heatmaps : str     Name of the output layer containing heatmaps. original_size : Tuple[float, float]     Original image size. input_size : Tuple[float, float]     Input image size. max_keypoints : int     Maximum number of keypoints to keep. previous_results : np.ndarray     Previous results from the model. Previous results are used to match keypoints between two frames. trigger : bool     Trigger to set the reference frame.  Output Message/s ---------------- **Type**: dai.TrackedFeatures  **Description**: TrackedFeatures message containing matched keypoints with the same ID.  Error Handling -------------- **ValueError**: If the original image size is not specified. **ValueError**: If the input image size is not specified. **ValueError**: If the maximum number of keypoints is not specified. **ValueError**: If the output layer containing features is not specified. **ValueError**: If the output layer containing keypoints is not specified. **ValueError**: If the output layer containing heatmaps is not specified.
class
XFeatStereoParser
Parser class for parsing the output of the XFeat model. It can be used for parsing the output from two sources (e.g. two cameras - left and right).  Attributes ---------- reference_input : Node.Input     Reference input. It is a linking point to which the Neural Network's output is linked. It accepts the output of the Neural Network node. target_input : Node.Input     Target input. It is a linking point to which the Neural Network's output is linked. It accepts the output of the Neural Network node. out : Node.Output     Parser sends the processed network results to this output in the form of a DepthAI message. It is a linking point from which the processed network results are retrieved. output_layer_feats : str     Name of the output layer from which the features are extracted. output_layer_keypoints : str     Name of the output layer from which the keypoints are extracted. output_layer_heatmaps : str     Name of the output layer from which the heatmaps are extracted. original_size : Tuple[float, float]     Original image size. input_size : Tuple[float, float]     Input image size. max_keypoints : int     Maximum number of keypoints to keep.  Output Message/s ---------------- **Type**: dai.TrackedFeatures  **Description**: TrackedFeatures message containing matched keypoints with the same ID.  Error Handling -------------- **ValueError**: If the original image size is not specified. **ValueError**: If the input image size is not specified. **ValueError**: If the maximum number of keypoints is not specified. **ValueError**: If the output layer containing features is not specified. **ValueError**: If the output layer containing keypoints is not specified. **ValueError**: If the output layer containing heatmaps is not specified.
class
YOLOExtendedParser
Parser class for parsing the output of the YOLO Instance Segmentation and Pose Estimation models.  Attributes ---------- conf_threshold : float     Confidence score threshold for detected objects. n_classes : int     Number of classes in the model. label_names : Optional[List[str]]     Names of the classes. iou_threshold : float     Intersection over union threshold. mask_conf : float     Mask confidence threshold. n_keypoints : int     Number of keypoints in the model. anchors : Optional[List[List[List[float]]]]     Anchors for the YOLO model (optional). keypoint_label_names : Optional[List[str]]     Labels for the keypoints. keypoint_edges : Optional[List[Tuple[int, int]]]     Keypoint connection pairs for visualizing the skeleton. Example: [(0,1), (1,2), (2,3), (3,0)] shows that keypoint 0 is connected to keypoint 1, keypoint 1 is connected to keypoint 2, etc. subtype : str     Version of the YOLO model.  Output Message/s ---------------- **Type**: dai.ImgDetections  **Description**: dai.ImgDetections message containing bounding boxes, labels, label names, confidence scores, and keypoints or masks and protos of the detected objects.
class
YuNetParser
Parser class for parsing the output of the YuNet face detection model.  Attributes ---------- conf_threshold : float     Confidence score threshold for detected faces. iou_threshold : float     Non-maximum suppression threshold. max_det : int     Maximum number of detections to keep. input_size : Tuple[int, int]     Input size (width, height). loc_output_layer_name: str     Name of the output layer containing the location predictions. conf_output_layer_name: str     Name of the output layer containing the confidence predictions. iou_output_layer_name: str     Name of the output layer containing the IoU predictions.  Output Message/s ---------------- **Type**: dai.ImgDetections  **Description**: dai.ImgDetections message containing bounding boxes, labels, confidence scores, and keypoints of detected faces.
class
class
SnapsUploader
Host node responsible for receiving SnapData messages and sending snaps to DepthAI Hub Events API.
class
Tiling
Produces tiling ImageManipConfig groups and supports runtime reconfiguration.  The node computes a :class:`dai.MessageGroup` of :class:`dai.ImageManipConfig` messages from the current tiling configuration. An internal Script node caches the latest config group from the ``cfg`` input using ``tryGet()`` and emits that group whenever a message arrives on the ``trigger`` input.  The main intended downstream consumer is :class:`depthai_nodes.node.FrameCropper` configured via ``fromManipConfigs``.
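A rough pipeline sketch of the pairing described above, assuming a ``depthai_nodes.node`` import path and an ``out`` output on the Tiling node; the linking details are illustrative, not verified API:

```python
import depthai as dai
from depthai_nodes.node import FrameCropper, Tiling  # import path assumed

with dai.Pipeline() as pipeline:
    # Tiling emits a dai.MessageGroup of dai.ImageManipConfig messages;
    # node construction/configuration details are omitted here.
    tiling = pipeline.create(Tiling)

    # FrameCropper consumes the precomputed config groups.
    cropper = pipeline.create(FrameCropper).fromManipConfigs(
        inputManipConfigs=tiling.out,        # output name assumed
        maxOutputFrameSize=1920 * 1080 * 3,  # worst-case tile size in bytes
        waitForConfig=True,
    )
```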
module

depthai_nodes.node.base_host_node

module

depthai_nodes.node.base_threaded_host_node

module

depthai_nodes.node.gather_data

class
type variable
type variable
class

depthai_nodes.node.gather_data.HasDetections(typing.Protocol)

property
detections
Return the detections used to derive the default wait count.
module

depthai_nodes.node.img_detections_filter

module

depthai_nodes.node.message_collector

type variable
package

depthai_nodes.node.parsers

module
module
module
module
module
module
module
module
module
module
module
module
module
module
module
module
module
module
package
module
module
module
class
HRNetParser
Parser class for parsing the output of the HRNet pose estimation model. The code is inspired by https://github.com/ibaiGorordo/ONNX-HRNET-Human-Pose-Estimation.  Attributes ---------- output_layer_name: str     Name of the output layer relevant to the parser. score_threshold : float     Confidence score threshold for detected keypoints. label_names: Optional[List[str]]     Label names for the keypoints. edges: Optional[List[Tuple[int, int]]]     Keypoint connection pairs for visualizing the skeleton. Example:         [(0,1), (1,2), (2,3), (3,0)] shows that keypoint 0 is connected to keypoint         1, keypoint 1 is connected to keypoint 2, etc.  Output Message/s ---------------- **Type**: Keypoints  **Description**: Output containing detected body keypoints.
class
KeypointParser
Parser class for 2D or 3D keypoints models. It expects one output layer containing keypoints. The number of keypoints must be specified. Moreover, the keypoints are normalized by a scale factor if provided.  Attributes ---------- output_layer_name: str     Name of the output layer relevant to the parser. scale_factor : float     Scale factor to divide the keypoints by. n_keypoints : int     Number of keypoints the model detects. score_threshold : float     Confidence score threshold for detected keypoints. label_names : List[str]     Label names for the keypoints. edges : List[Tuple[int, int]]     Keypoint connection pairs for visualizing the skeleton. Example: [(0,1), (1,2), (2,3), (3,0)] shows that keypoint 0 is connected to keypoint 1, keypoint 1 is connected to keypoint 2, etc.  Output Message/s ---------------- **Type**: Keypoints  **Description**: Output containing 2D or 3D keypoints.  Error Handling -------------- **ValueError**: If the number of keypoints is not specified.  **ValueError**: If the number of coordinates per keypoint is not 2 or 3.  **ValueError**: If the number of output layers is not 1.
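A configuration sketch using the setters documented further below; the pipeline-creation pattern, layer name, and values are model-specific assumptions:

```python
from depthai_nodes.node.parsers import KeypointParser

parser = pipeline.create(KeypointParser)  # `pipeline` is an existing dai.Pipeline
parser.setOutputLayerName("keypoints")    # model-specific layer name (assumed)
parser.setNumKeypoints(17)                # required: keypoints the model outputs
parser.setScaleFactor(224.0)              # divide raw coordinates by the input size
parser.setScoreThreshold(0.5)
```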
class
SuperAnimalParser
Parser class for parsing the output of the SuperAnimal landmark model.  Attributes ---------- output_layer_name: str     Name of the output layer relevant to the parser. scale_factor : float     Scale factor to divide the keypoints by. n_keypoints : int     Number of keypoints. score_threshold : float     Confidence score threshold for detected keypoints. label_names : List[str]     Label names for the keypoints. edges : List[Tuple[int, int]]     Keypoint connection pairs for visualizing the skeleton. Example: [(0,1), (1,2), (2,3), (3,0)] shows that keypoint 0 is connected to keypoint 1, keypoint 1 is connected to keypoint 2, etc.  Output Message/s ---------------- **Type**: Keypoints  **Description**: Output containing detected keypoints that exceed the confidence threshold.
module

depthai_nodes.node.parsers.base_parser

package

depthai_nodes.node.parsers.utils

module
module
function
decode_head(head) -> Dict[str, Any]: Dict[str, Any]
Decode head object into a dictionary containing configuration details.  @param head: The head object to decode. @type head: dai.nn_archive.v1.Head @return: A dictionary containing configuration details relevant to the head. @rtype: Dict[str, Any]
module
module
module
module
module
medipipe
mediapipe.py.  Description: This script contains utility functions for decoding the output of the MediaPipe hand tracking model.  This script contains code that is based on or directly taken from a public GitHub repository: https://github.com/geaxgx/depthai_hand_tracker  Original code author(s): geaxgx  License: MIT License  Copyright (c) [2021] [geax]
module
module
module
module
module
module
module
module
module
function
sigmoid(x: np.ndarray) -> np.ndarray: np.ndarray
Sigmoid function.  @param x: Input tensor. @type x: np.ndarray @return: A result tensor after applying a sigmoid function on the given input. @rtype: np.ndarray
function
softmax(x: np.ndarray, axis: Optional [ int ] = None, keep_dims: bool = False) -> np.ndarray: np.ndarray
Compute the softmax of an array. The softmax function is defined as: softmax(x) = exp(x) / sum(exp(x))  @param x: The input array. @type x: np.ndarray @param axis: Axis or axes along which a sum is performed. The default, axis=None,     will sum all of the elements of the input array. If axis is negative it counts     from the last to the first axis. @type axis: int @param keep_dims: If this is set to True, the axes which are reduced are left in the     result as dimensions with size one. With this option, the result will broadcast     correctly against the input array. @type keep_dims: bool @return: The softmax of the input array. @rtype: np.ndarray
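A numerically stable reference sketch of the formula above; the max subtraction is an addition for stability and may differ from the package's implementation:

```python
import numpy as np

def softmax_ref(x: np.ndarray, axis=None) -> np.ndarray:
    # Subtracting the per-axis max before exp() avoids overflow and
    # does not change the result of the division.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)

probs = softmax_ref(np.array([1.0, 2.0, 3.0]))  # -> roughly [0.09, 0.24, 0.67]
```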
function
corners_to_rotated_bbox(corners: np.ndarray) -> np.ndarray: np.ndarray
Converts the corners of a bounding box to a rotated bounding box.  @param corners: The corners of the bounding box, ordered as top-left, top-right, bottom-right, bottom-left. @type corners: np.ndarray @return: The rotated bounding box defined as [x_center, y_center, width, height,     angle]. @rtype: np.ndarray
function
normalize_bboxes(bboxes: np.ndarray, height: int, width: int) -> np.ndarray: np.ndarray
Normalize bounding box coordinates to (0, 1).  @param bboxes: A numpy array of shape (N, 4) containing the bounding boxes. @type bboxes: np.ndarray @param height: The height of the image. @type height: int @param width: The width of the image. @type width: int @return: A numpy array of shape (N, 4) containing the normalized bounding boxes. @rtype: np.ndarray
function
rotated_bbox_to_corners(cx: float, cy: float, w: float, h: float, rotation: float) -> np.ndarray: np.ndarray
Converts a rotated bounding box to the corners of the bounding box.  @param cx: The x-coordinate of the center of the bounding box. @type cx: float @param cy: The y-coordinate of the center of the bounding box. @type cy: float @param w: The width of the bounding box. @type w: float @param h: The height of the bounding box. @type h: float @param rotation: The angle of the bounding box given in degrees. @type rotation: float @return: The corners of the bounding box. @rtype: np.ndarray
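A geometry sketch of this conversion: rotate the four axis-aligned corner offsets around the box center. The counter-clockwise sign convention here is an assumption and may differ from the package's:

```python
import numpy as np

def rotated_bbox_corners_ref(cx, cy, w, h, rotation_deg):
    theta = np.deg2rad(rotation_deg)
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    # Corner offsets ordered top-left, top-right, bottom-right, bottom-left.
    offsets = np.array([[-w / 2, -h / 2], [w / 2, -h / 2],
                        [w / 2, h / 2], [-w / 2, h / 2]])
    rot = np.array([[cos_t, -sin_t], [sin_t, cos_t]])
    return offsets @ rot.T + np.array([cx, cy])  # shape (4, 2)
```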
function
top_left_wh_to_xywh(bboxes: np.ndarray) -> np.ndarray: np.ndarray
Converts bounding boxes from [top_left_x, top_left_y, width, height] to [x_center, y_center, width, height].  @param bboxes: The bounding boxes to convert. @type bboxes: np.ndarray @return: The converted bounding boxes. @rtype: np.ndarray
function
xywh_to_xyxy(bboxes: np.ndarray) -> np.ndarray: np.ndarray
Convert bounding box coordinates from (x_center, y_center, width, height) to (x_min, y_min, x_max, y_max).  @param bboxes: A numpy array of shape (N, 4) containing the bounding boxes in (x, y, width, height) format. @type bboxes: np.ndarray @return: A numpy array of shape (N, 4) containing the bounding boxes in (x_min, y_min, x_max, y_max) format. @rtype: np.ndarray
function
xyxy_to_xywh(bboxes: np.ndarray) -> np.ndarray: np.ndarray
Converts bounding boxes from [x_min, y_min, x_max, y_max] to [x_center, y_center, width, height].  @param bboxes: The bounding boxes to convert. @type bboxes: np.ndarray @return: The converted bounding boxes. @rtype: np.ndarray
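The two conversions above are exact inverses of each other; a compact reference sketch:

```python
import numpy as np

def xywh_to_xyxy_ref(bboxes: np.ndarray) -> np.ndarray:
    # (x_center, y_center, w, h) -> (x_min, y_min, x_max, y_max)
    out = bboxes.astype(float).copy()
    out[:, 0] = bboxes[:, 0] - bboxes[:, 2] / 2
    out[:, 1] = bboxes[:, 1] - bboxes[:, 3] / 2
    out[:, 2] = bboxes[:, 0] + bboxes[:, 2] / 2
    out[:, 3] = bboxes[:, 1] + bboxes[:, 3] / 2
    return out

def xyxy_to_xywh_ref(bboxes: np.ndarray) -> np.ndarray:
    # (x_min, y_min, x_max, y_max) -> (x_center, y_center, w, h)
    out = bboxes.astype(float).copy()
    out[:, 0] = (bboxes[:, 0] + bboxes[:, 2]) / 2
    out[:, 1] = (bboxes[:, 1] + bboxes[:, 3]) / 2
    out[:, 2] = bboxes[:, 2] - bboxes[:, 0]
    out[:, 3] = bboxes[:, 3] - bboxes[:, 1]
    return out
```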
function
unnormalize_image(image, normalize = True)
Un-normalize an image tensor by scaling it to the [0, 255] range.  @param image: The normalized image tensor of shape (H, W, C) or (C, H, W). @type image: np.ndarray @param normalize: Whether to normalize the image tensor. Defaults to True. @type normalize: bool @return: The un-normalized image. @rtype: np.ndarray
module

depthai_nodes.node.parsers.utils.fastsam

function
box_prompt(masks: np.ndarray, bbox: Tuple [ int , int , int , int ], orig_shape: Tuple [ int , int ]) -> np.ndarray: np.ndarray
Modifies the bounding box properties and calculates IoU between masks and bounding box.  Source: https://github.com/ultralytics/ultralytics/blob/main/ultralytics/models/fastsam/prompt.py#L286 Modified so it uses numpy instead of torch.  @param masks: The resulting masks of the FastSAM model @type masks: np.ndarray @param bbox: The prompt bounding box coordinates @type bbox: Tuple[int, int, int, int] @param orig_shape: The original shape of the image @type orig_shape: Tuple[int, int] (height, width) @return: The modified masks @rtype: np.ndarray
function
format_results(bboxes: np.ndarray, masks: np.ndarray, filter: int = 0) -> List[Dict[str, Any]]: List[Dict[str, Any]]
Formats detection results into list of annotations each containing ID, segmentation, bounding box, score and area.  Source: https://github.com/ultralytics/ultralytics/blob/main/ultralytics/models/fastsam/prompt.py#L56  @param bboxes: The bounding boxes of the detected objects @type bboxes: np.ndarray @param masks: The masks of the detected objects @type masks: np.ndarray @param filter: The filter value @type filter: int @return: The formatted annotations @rtype: List[Dict[str, Any]]
function
point_prompt(bboxes: np.ndarray, masks: np.ndarray, points: List [ Tuple [ int , int ] ], pointlabel: List [ int ], orig_shape: Tuple [ int , int ]) -> np.ndarray: np.ndarray
Adjusts points on detected masks based on user input and returns the modified results.  Source: https://github.com/ultralytics/ultralytics/blob/main/ultralytics/models/fastsam/prompt.py#L321 Modified so it uses numpy instead of torch.  @param bboxes: The bounding boxes of the detected objects @type bboxes: np.ndarray @param masks: The masks of the detected objects @type masks: np.ndarray @param points: The points to adjust @type points: List[Tuple[int, int]] @param pointlabel: The point labels @type pointlabel: List[int] @param orig_shape: The original shape of the image @type orig_shape: Tuple[int, int] (height, width) @return: The modified masks @rtype: np.ndarray
function
adjust_bboxes_to_image_border(boxes: np.ndarray, image_shape: Tuple [ int , int ], threshold: int = 20) -> np.ndarray: np.ndarray
Source: https://github.com/ultralytics/ultralytics/blob/main/ultralytics/models/fastsam/utils.py#L6 (Ultralytics) Adjust bounding boxes to stick to image border if they are within a certain threshold.  @param boxes: Bounding boxes @type boxes: np.ndarray @param image_shape: Image shape @type image_shape: Tuple[int, int] @param threshold: Pixel threshold @type threshold: int @return: Adjusted bounding boxes @rtype: np.ndarray
function
bbox_iou(box1: np.ndarray, boxes: np.ndarray, iou_thres: float = 0.9, image_shape: Tuple [ int , int ] = (640, 640), raw_output: bool = False) -> np.ndarray: np.ndarray
Source: https://github.com/ultralytics/ultralytics/blob/main/ultralytics/models/fastsam/utils.py#L30 (Ultralytics - rewritten to numpy) Compute the Intersection-Over-Union of a bounding box with respect to an array of other bounding boxes.  @param box1: Array of shape (4, ) representing a single bounding box. @type box1: np.ndarray @param boxes: Array of shape (n, 4) representing multiple bounding boxes. @type boxes: np.ndarray @param iou_thres: IoU threshold @type iou_thres: float @param image_shape: Image shape (height, width) @type image_shape: Tuple[int, int] @param raw_output: If True, return the raw IoU values instead of the indices @type raw_output: bool @return: Indices of boxes with IoU > thres, or the raw IoU values if raw_output is True @rtype: np.ndarray
function
decode_fastsam_output(outputs: List [ np.ndarray ], strides: List [ int ], anchors: List [ Optional [ np.ndarray ] ], img_shape: Tuple [ int , int ], conf_thres: float = 0.5, iou_thres: float = 0.45, num_classes: int = 1) -> np.ndarray: np.ndarray
Decode the output of the FastSAM model.  @param outputs: List of FastSAM outputs @type outputs: List[np.ndarray] @param strides: List of strides @type strides: List[int] @param anchors: List of anchors @type anchors: List[Optional[np.ndarray]] @param img_shape: Image shape @type img_shape: Tuple[int, int] @param conf_thres: Confidence threshold @type conf_thres: float @param iou_thres: IoU threshold @type iou_thres: float @param num_classes: Number of classes @type num_classes: int @return: NMS output @rtype: np.ndarray
function
build_mask_coeffs(parsed_results: np.ndarray, masks_outputs_values: list [ np.ndarray ], protos_len: int) -> np.ndarray: np.ndarray
Gather mask coefficients for all detections, grouped by head.  @param parsed_results: FastSAM decoded outputs @type parsed_results: np.ndarray @param masks_outputs_values: Model mask outputs @type masks_outputs_values: list[np.ndarray] @param protos_len: Number of protos @type protos_len: int @return: Mask coefficients for all detections. @rtype: np.ndarray
function
process_masks(parsed_results: np.ndarray, mask_coeffs: np.ndarray, protos: np.ndarray, orig_shape: Tuple [ int , int ], mask_conf: float) -> np.ndarray: np.ndarray
Process output into full-size masks for all detections.  @param parsed_results: FastSAM decoded outputs @type parsed_results: np.ndarray @param mask_coeffs: Mask coefficients @type mask_coeffs: np.ndarray @param protos: Protos from model output @type protos: np.ndarray @param orig_shape: Input shape of the model @type orig_shape: Tuple[int, int] @param mask_conf: Mask confidence @type mask_conf: float @return: Full-size masks for all detections. @rtype: np.ndarray
function
merge_masks(masks: np.ndarray) -> np.ndarray: np.ndarray
Merge masks to a 2D array where each object is represented by a unique label.  @param masks: 3D array of masks @type masks: np.ndarray @return: 2D array of masks @rtype: np.ndarray
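A reference sketch, under the assumption that ``masks`` is an (N, H, W) stack of binary masks and that later masks win on overlap (the package's tie-breaking may differ):

```python
import numpy as np

def merge_masks_ref(masks: np.ndarray) -> np.ndarray:
    # masks: (N, H, W) stack of per-object binary masks.
    # Pixels covered by no mask stay -1; on overlap, later masks
    # overwrite earlier ones (an assumption of this sketch).
    merged = np.full(masks.shape[1:], -1, dtype=np.int16)
    for label, mask in enumerate(masks):
        merged[mask > 0] = label
    return merged
```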
module

depthai_nodes.node.parsers.utils.keypoints

function
normalize_keypoints(keypoints: np.ndarray, height: int, width: int) -> np.ndarray: np.ndarray
Normalize keypoint coordinates to (0, 1).  @param keypoints: A numpy array of shape (N, 2) or (N, K, 2) where N is the number of keypoint sets and K is the number of keypoints in each set. @type keypoints: np.ndarray @param height: The height of the image. @type height: int @param width: The width of the image. @type width: int @return: A numpy array of the same shape as the input containing the normalized keypoints. @rtype: np.ndarray
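A one-line reference sketch, assuming the last axis is ordered (x, y):

```python
import numpy as np

def normalize_keypoints_ref(keypoints, height, width):
    # Divide x by width and y by height; broadcasting handles both
    # (N, 2) and (N, K, 2) inputs, since the last axis is (x, y).
    return keypoints / np.array([width, height], dtype=np.float32)
```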
module

depthai_nodes.node.parsers.utils.masks_utils

function
crop_mask(mask: np.ndarray, bbox: np.ndarray) -> np.ndarray: np.ndarray
It takes a mask and a bounding box, and returns a mask that is cropped to the bounding box.  @param mask: [h, w] numpy array of a single mask @type mask: np.ndarray @param bbox: A numpy array of bbox coordinates in (x_center, y_center, width,     height) format @type bbox: np.ndarray @return: A mask that is cropped to the bounding box @rtype: np.ndarray
function
process_single_mask(protos: np.ndarray, mask_coeff: np.ndarray, mask_conf: float, bbox: np.ndarray) -> np.ndarray: np.ndarray
Process a single mask.  @param protos: Protos. @type protos: np.ndarray @param mask_coeff: Mask coefficient. @type mask_coeff: np.ndarray @param mask_conf: Mask confidence. @type mask_conf: float @param bbox: A numpy array of bbox coordinates in (x_center, y_center, width,     height) normalized format. @type bbox: np.ndarray @return: Processed mask. @rtype: np.ndarray
function
module

depthai_nodes.node.parsers.utils.medipipe

class
HandRegion
Attributes: pd_score : detection score. pd_box : detection box [x, y, w, h], normalized [0,1] in the squared image. pd_kps : detection keypoint coordinates [x, y], normalized [0,1] in the squared image. rect_x_center, rect_y_center : center coordinates of the rotated bounding rectangle, normalized [0,1] in the squared image. rect_w, rect_h : width and height of the rotated bounding rectangle, normalized in the squared image (may be > 1). rotation : rotation angle of the rotated bounding rectangle with respect to the y-axis, in radians. rect_x_center_a, rect_y_center_a : center coordinates of the rotated bounding rectangle, in pixels in the squared image. rect_w_a, rect_h_a : width and height of the rotated bounding rectangle, in pixels in the squared image. rect_points : list of the 4 corner points of the rotated bounding rectangle, in pixels; expressed in the squared image during processing and in the source rectangular image when returned to the user.
variable
function
function
generate_anchors(options)
Generate anchors from the given SSD anchor options.  @param options: SSDAnchorOptions; see https://github.com/google/mediapipe/blob/master/mediapipe/calculators/tflite/ssd_anchors_calculator.cc
function
function
function
function
function
function
function
decode(bboxes, scores, anchors, threshold = 0.5, scale = 192)
Generate anchors and decode bounding boxes for the MediaPipe hand detection model.
class

depthai_nodes.node.parsers.utils.medipipe.HandRegion

module

depthai_nodes.node.parsers.utils.mlsd

function
decode_scores_and_points(tpMap: np.ndarray, heat: np.ndarray, topk_n: int) -> Tuple[np.ndarray, np.ndarray, np.ndarray]: Tuple[np.ndarray, np.ndarray, np.ndarray]
Decode the scores and points from the neural network output tensors. Used for MLSD model.  @param tpMap: Tensor containing the vector map. @type tpMap: np.ndarray @param heat: Tensor containing the heat map. @type heat: np.ndarray @param topk_n: Number of top candidates to keep. @type topk_n: int @return: Detected points, confidence scores for the detected points, and vector map. @rtype: Tuple[np.ndarray, np.ndarray, np.ndarray]
function
get_lines(pts: np.ndarray, pts_score: np.ndarray, vmap: np.ndarray, score_thr: float, dist_thr: float, input_size: int = 512) -> Tuple[np.ndarray, List[float]]: Tuple[np.ndarray, List[float]]
Get lines from the detected points and scores. The lines are filtered by the score threshold and distance threshold. Used for MLSD model.  @param pts: Detected points. @type pts: np.ndarray @param pts_score: Confidence scores for the detected points. @type pts_score: np.ndarray @param vmap: Vector map. @type vmap: np.ndarray @param score_thr: Confidence score threshold for detected lines. @type score_thr: float @param dist_thr: Distance threshold for merging lines. @type dist_thr: float @param input_size: Input size of the model. @type input_size: int @return: Detected lines and their confidence scores. @rtype: Tuple[np.ndarray, List[float]]
module

depthai_nodes.node.parsers.utils.nms

function
nms(dets: np.ndarray, nms_thresh: float = 0.5) -> List[int]: List[int]
Non-maximum suppression.  @param dets: Bounding boxes and confidence scores. @type dets: np.ndarray @param nms_thresh: Non-maximum suppression threshold. @type nms_thresh: float @return: Indices of the detections to keep. @rtype: List[int]
function
nms_cv2(bboxes: np.ndarray, scores: np.ndarray, conf_threshold: float, iou_threshold: float, max_det: int)
Non-maximum suppression from the opencv-python library.  @param bboxes: A numpy array of shape (N, 4) containing the bounding boxes. @type bboxes: np.ndarray @param scores: A numpy array of shape (N,) containing the scores. @type scores: np.ndarray @param conf_threshold: Confidence threshold for filtering boxes. @type conf_threshold: float @param iou_threshold: IoU threshold for non-maximum suppression. @type iou_threshold: float @param max_det: Maximum number of detections to keep. @type max_det: int @return: Indices of the detections to keep. @rtype: List[int]
module

depthai_nodes.node.parsers.utils.ppdet

function
parse_paddle_detection_outputs(predictions: np.ndarray, mask_threshold: float = 0.25, bbox_threshold: float = 0.5, max_detections: int = 100, width: Optional [ int ] = None, height: Optional [ int ] = None) -> Tuple[np.ndarray, np.ndarray, np.ndarray]: Tuple[np.ndarray, np.ndarray, np.ndarray]
Parse the output of a PaddlePaddle Text Detection model from a mask of text probabilities into rotated bounding boxes with additional corners saved as keypoints.  @param predictions: The output of a PaddlePaddle Text Detection model. @type predictions: np.ndarray @param mask_threshold: The threshold for the mask. @type mask_threshold: float @param bbox_threshold: The threshold for bounding boxes. @type bbox_threshold: float @param max_detections: The maximum number of candidate bounding boxes. @type max_detections: int @param width: The width of the image. @type width: Optional[int] @param height: The height of the image. @type height: Optional[int] @return: A tuple containing the rotated bounding boxes, corners, and scores. @rtype: Tuple[np.ndarray, np.ndarray, np.ndarray]
module

depthai_nodes.node.parsers.utils.scrfd

function
compute_anchor_centers(strides: List [ int ], input_size: Tuple [ int , int ], num_anchors: int) -> Dict[int, np.ndarray]: Dict[int, np.ndarray]
Compute the anchor centers for a given list of strides, input size, and number of anchors.  @param strides: List of strides. @type strides: List[int] @param input_size: Input size. @type input_size: Tuple[int, int] @param num_anchors: Number of anchors. @type num_anchors: int @return: Dictionary of anchor centers. @rtype: Dict[int, np.ndarray]
function
distance2bbox(points, distance, max_shape = None)
Decode distance prediction to bounding box.  @param points: Shape (n, 2), [x, y]. @type points: np.ndarray @param distance: Distance from the given point to 4 boundaries (left, top, right,     bottom). @type distance: np.ndarray @param max_shape: Shape of the image. @type max_shape: Tuple[int, int] @return: Decoded bboxes. @rtype: np.ndarray
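A reference sketch of the decoding rule: subtract the left/top distances from, and add the right/bottom distances to, the anchor center:

```python
import numpy as np

def distance2bbox_ref(points, distance, max_shape=None):
    # points: (n, 2) anchor centers; distance: (n, 4) offsets to the
    # left, top, right, bottom box edges.
    x1 = points[:, 0] - distance[:, 0]
    y1 = points[:, 1] - distance[:, 1]
    x2 = points[:, 0] + distance[:, 2]
    y2 = points[:, 1] + distance[:, 3]
    if max_shape is not None:  # optionally clip to the (h, w) image shape
        h, w = max_shape
        x1, x2 = np.clip(x1, 0, w), np.clip(x2, 0, w)
        y1, y2 = np.clip(y1, 0, h), np.clip(y2, 0, h)
    return np.stack([x1, y1, x2, y2], axis=-1)
```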
function
distance2kps(points, distance, max_shape = None)
Decode distance prediction to keypoints.  @param points: Shape (n, 2), [x, y]. @type points: np.ndarray @param distance: Distances from the given points to the keypoints. @type distance: np.ndarray @param max_shape: Shape of the image. @type max_shape: Tuple[int, int] @return: Decoded keypoints. @rtype: np.ndarray
function
decode_scrfd(bboxes_concatenated, scores_concatenated, kps_concatenated, feat_stride_fpn, input_size, num_anchors, score_threshold, nms_threshold, anchors)
Decode the detection results of SCRFD.  @param bboxes_concatenated: List of bounding box predictions for each scale. @type bboxes_concatenated: list[np.ndarray] @param scores_concatenated: List of confidence score predictions for each scale. @type scores_concatenated: list[np.ndarray] @param kps_concatenated: List of keypoint predictions for each scale. @type kps_concatenated: list[np.ndarray] @param feat_stride_fpn: List of feature strides for each scale. @type feat_stride_fpn: list[int] @param input_size: Input size of the model. @type input_size: tuple[int] @param num_anchors: Number of anchors. @type num_anchors: int @param score_threshold: Confidence score threshold. @type score_threshold: float @param nms_threshold: Non-maximum suppression threshold. @type nms_threshold: float @param anchors: Dictionary of anchors. @type anchors: dict[int, np.ndarray] @return: Bounding boxes, confidence scores, and keypoints of detected objects. @rtype: tuple[np.ndarray, np.ndarray, np.ndarray]
module

depthai_nodes.node.parsers.utils.superanimal

function
get_top_values(heatmap)
Get the top values from the heatmap tensor.  @param heatmap: Heatmap tensor. @type heatmap: np.ndarray @return: Y and X coordinates of the top values. @rtype: Tuple[np.ndarray, np.ndarray]
function
get_pose_prediction(heatmap, locref, scale_factors)
Get the pose prediction from the heatmap and locref tensors. Used for SuperAnimal model.  @param heatmap: Heatmap tensor. @type heatmap: np.ndarray @param locref: Locref tensor. @type locref: np.ndarray @param scale_factors: Scale factors for the x and y axes. @type scale_factors: Tuple[float, float] @return: Pose prediction. @rtype: np.ndarray
module

depthai_nodes.node.parsers.utils.xfeat

function
local_maximum_filter(x: np.ndarray, kernel_size: int) -> np.ndarray: np.ndarray
Apply a local maximum filter to the input array.  @param x: Input array. @type x: np.ndarray @param kernel_size: Size of the local maximum filter. @type kernel_size: int @return: Output array after applying the local maximum filter. @rtype: np.ndarray
function
normgrid(x, H, W)
Normalize coords to [-1,1].  @param x: Input coordinates, shape (N, Hg, Wg, 2) @type x: np.ndarray @param H: Height of the output feature map @type H: int @param W: Width of the output feature map @type W: int @return: Normalized coordinates, shape (N, Hg, Wg, 2) @rtype: np.ndarray
function
bilinear(im, pos, H, W)
Given an input feature map and a flow-field grid, computes the output using input values and pixel locations from the grid. Only the bilinear interpolation method is supported for sampling the input pixels.  @param im: Input feature map, shape (N, C, H, W) @type im: np.ndarray @param pos: Point coordinates, shape (N, Hg, Wg, 2) @type pos: np.ndarray @param H: Height of the output feature map @type H: int @param W: Width of the output feature map @type W: int @return: A tensor with sampled points, shape (N, C, Hg, Wg) @rtype: np.ndarray
function
detect_and_compute(feats: np.ndarray, kpts: np.ndarray, heatmaps: np.ndarray, resize_rate_w: float, resize_rate_h: float, input_size: Tuple [ int , int ], top_k: int = 4096) -> List[Dict[str, Any]]: List[Dict[str, Any]]
Detect and compute keypoints.  @param feats: Features. @type feats: np.ndarray @param kpts: Keypoints. @type kpts: np.ndarray @param heatmaps: Heatmaps. @type heatmaps: np.ndarray @param resize_rate_w: Resize rate for width. @type resize_rate_w: float @param resize_rate_h: Resize rate for height. @type resize_rate_h: float @param input_size: Input size. @type input_size: Tuple[int, int] @param top_k: Maximum number of keypoints to keep. @type top_k: int @return: List of dictionaries containing keypoints, scores, and descriptors. @rtype: List[Dict[str, Any]]
function
match(result1: Dict [ str , Any ], result2: Dict [ str , Any ], min_cossim: float = -1) -> Tuple[np.ndarray, np.ndarray]: Tuple[np.ndarray, np.ndarray]
Match keypoints.  @param result1: Result 1. @type result1: Dict[str, Any] @param result2: Result 2. @type result2: Dict[str, Any] @param min_cossim: Minimum cosine similarity. @type min_cossim: float @return: Matched keypoints. @rtype: Tuple[np.ndarray, np.ndarray]
module

depthai_nodes.node.parsers.utils.yolo

variable
class
function
make_grid_numpy(ny: int, nx: int, na: int) -> np.ndarray: np.ndarray
Create a grid of shape (1, na, ny, nx, 2)  @param ny: Number of y coordinates. @type ny: int @param nx: Number of x coordinates. @type nx: int @param na: Number of anchors. @type na: int @return: Grid. @rtype: np.ndarray
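A reference sketch matching the documented shape, assuming the last axis holds (x, y) cell indices:

```python
import numpy as np

def make_grid_ref(ny: int, nx: int, na: int) -> np.ndarray:
    # Grid of (x, y) cell coordinates, broadcast over `na` anchors,
    # matching the documented (1, na, ny, nx, 2) shape.
    yv, xv = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")
    grid = np.stack((xv, yv), axis=2)  # (ny, nx, 2)
    return np.broadcast_to(grid, (1, na, ny, nx, 2)).copy()
```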
function
non_max_suppression(prediction: np.ndarray, conf_thres: float = 0.5, iou_thres: float = 0.45, classes: Optional [ List ] = None, num_classes: int = 1, agnostic: bool = False, multi_label: bool = False, max_det: int = 300, max_time_img: float = 0.05, max_nms: int = 30000, max_wh: int = 7680, kpts_mode: bool = False, det_mode: bool = False) -> List[np.ndarray]: List[np.ndarray]
Performs Non-Maximum Suppression (NMS) on inference results.  @param prediction: Prediction from the model, shape = (batch_size, boxes, xy+wh+...) @type prediction: np.ndarray @param conf_thres: Confidence threshold. @type conf_thres: float @param iou_thres: Intersection over union threshold. @type iou_thres: float @param classes: For filtering by classes. @type classes: Optional[List] @param num_classes: Number of classes. @type num_classes: int @param agnostic: Runs NMS on all boxes together rather than per class if True. @type agnostic: bool @param multi_label: Multilabel classification. @type multi_label: bool @param max_det: Limiting detections. @type max_det: int @param max_time_img: Maximum time for processing an image. @type max_time_img: float @param max_nms: Maximum number of boxes. @type max_nms: int @param max_wh: Maximum width and height. @type max_wh: int @param kpts_mode: Keypoints mode. @type kpts_mode: bool @param det_mode: Detection only mode. If True, the output will only contain bbox detections. @type det_mode: bool @return: An array of detections. If det_mode is False, the detections may include kpts or segmentation outputs. @rtype: List[np.ndarray]
function
parse_yolo_output(out: np.ndarray, stride: int, num_outputs: int, anchors: Optional [ np.ndarray ] = None, head_id: int = -1, kpts: Optional [ np.ndarray ] = None, det_mode: bool = False, subtype: YOLOSubtype = YOLOSubtype.DEFAULT) -> np.ndarray: np.ndarray
Parse a single channel output of a YOLO model.  @param out: A single output of a YOLO model for the given channel. @type out: np.ndarray @param stride: Stride. @type stride: int @param num_outputs: Number of outputs of the model. @type num_outputs: int @param anchors: Anchors for the given head. @type anchors: Optional[np.ndarray] @param head_id: Head ID. @type head_id: int @param kpts: A single output of keypoints for the given channel. @type kpts: Optional[np.ndarray] @param det_mode: Detection only mode. @type det_mode: bool @param subtype: YOLO version. @type subtype: YOLOSubtype @return: Parsed output. @rtype: np.ndarray
function
parse_kpts(kpts: np.ndarray, n_keypoints: int, img_shape: Tuple [ int , int ]) -> List[Tuple[float, float, float]]: List[Tuple[float, float, float]]
Parse keypoints.  @param kpts: Result keypoints. @type kpts: np.ndarray @param n_keypoints: Number of keypoints. @type n_keypoints: int @param img_shape: Image shape of the model input in (height, width) format. @type img_shape: Tuple[int, int] @return: Parsed keypoints. @rtype: List[Tuple[float, float, float]]
function
decode_yolo26(raw: np.ndarray, conf_threshold: float, max_det: int, extra_raw: Optional [ np.ndarray ] = None) -> Tuple[np.ndarray, Optional[np.ndarray]]: Tuple[np.ndarray, Optional[np.ndarray]]
Decode YOLO26 output for detection, segmentation, or pose.  YOLO26 end2end output is already decoded (xyxy in pixels) with a pre-computed confidence score (ReduceMax over class scores), so only confidence thresholding and top-k selection are needed. Optionally filters an auxiliary tensor (mask coefficients or keypoints) with the detections.  @param raw: Raw detection tensor (N, A, 5+nc) where columns are     [x1, y1, x2, y2, conf, cls_0, ..., cls_nc-1]. @type raw: np.ndarray @param conf_threshold: Confidence threshold. @type conf_threshold: float @param max_det: Maximum number of detections. @type max_det: int @param extra_raw: Optional auxiliary tensor (N, A, M) such as mask coefficients or     keypoints. When provided the kept rows are returned as the second element. @type extra_raw: Optional[np.ndarray] @return: Tuple of (detection results (K, 6), kept auxiliary data (K, M) or None). @rtype: Tuple[np.ndarray, Optional[np.ndarray]]
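A single-image sketch of the filtering described above; the column layout follows the docstring, while the selection logic itself is an assumption of the sketch:

```python
import numpy as np

def decode_yolo26_ref(raw, conf_threshold, max_det):
    # raw: (1, A, 5 + nc); columns [x1, y1, x2, y2, conf, cls_0, ...].
    preds = raw[0]
    preds = preds[preds[:, 4] >= conf_threshold]  # confidence filter
    if preds.shape[0] > max_det:                  # top-k by confidence
        keep = np.argsort(-preds[:, 4])[:max_det]
        preds = preds[keep]
    classes = preds[:, 5:].argmax(axis=1)         # per-row class id
    return np.concatenate(
        [preds[:, :5], classes[:, None].astype(preds.dtype)], axis=1
    )  # (K, 6): x1, y1, x2, y2, conf, class
```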
function
decode_yolo_output(yolo_outputs: List [ np.ndarray ], strides: List [ int ], anchors: Optional [ np.ndarray ] = None, kpts: Optional [ List [ np.ndarray ] ] = None, conf_thres: float = 0.5, iou_thres: float = 0.45, num_classes: int = 1, det_mode: bool = False, subtype: YOLOSubtype = YOLOSubtype.DEFAULT, max_nms: int = 3000) -> np.ndarray: np.ndarray
Decode the output of a YOLO instance segmentation or pose estimation model.  @param yolo_outputs: List of YOLO outputs. @type yolo_outputs: List[np.ndarray] @param strides: List of strides. @type strides: List[int] @param anchors: An optional array of anchors. @type anchors: Optional[np.ndarray] @param kpts: An optional list of keypoints. @type kpts: Optional[List[np.ndarray]] @param conf_thres: Confidence threshold. @type conf_thres: float @param iou_thres: Intersection over union threshold. @type iou_thres: float @param num_classes: Number of classes. @type num_classes: int @param det_mode: Detection only mode. If True, the output will only contain bbox     detections. @type det_mode: bool @param subtype: YOLO version. @type subtype: YOLOSubtype @param max_nms: Maximum number of boxes to keep after NMS. @type max_nms: int @return: NMS output. @rtype: np.ndarray
class

depthai_nodes.node.parsers.utils.yolo.YOLOSubtype(str, enum.Enum)

constant
constant
constant
constant
constant
constant
constant
constant
constant
constant
constant
constant
constant
constant
constant
constant
constant
constant
constant
module

depthai_nodes.node.parsers.utils.yunet

function
manual_product(args)
You can use this function instead of itertools.product.
function
generate_anchors(input_size: Tuple [ int , int ], min_sizes: Optional [ List [ List [ int ] ] ] = None, strides: Optional [ List [ int ] ] = None)
Generate a set of default bounding boxes, known as anchors. The code is taken from https://github.com/Kazuhito00/YuNet-ONNX-TFLite-Sample/tree/main  @param input_size: A tuple representing the width and height of the input image. @type input_size: Tuple[int, int] @param min_sizes: A list of lists, where each inner list contains the minimum sizes of the anchors for different feature maps. If None then '[[10, 16, 24], [32, 48], [64, 96], [128, 192, 256]]' will be used. Defaults to None. @type min_sizes: Optional[List[List[int]]] @param strides: Strides for each feature map layer. If None then '[8, 16, 32, 64]' will be used. Defaults to None. @type strides: Optional[List[int]] @return: Anchors. @rtype: np.ndarray
function
decode_detections(input_size: Tuple [ int , int ], loc: np.ndarray, conf: np.ndarray, iou: np.ndarray, variance: Optional [ List [ float ] ] = None)
Decodes the output of an object detection model by converting the model's predictions (localization, confidence, and IoU scores) into bounding boxes, keypoints, and scores. The code is taken from https://github.com/Kazuhito00/YuNet-ONNX-TFLite-Sample/tree/main  @param input_size: The size of the input image (height, width). @type input_size: tuple @param loc: The predicted locations (or offsets) of the bounding boxes. @type loc: np.ndarray @param conf: The predicted class confidence scores. @type conf: np.ndarray @param iou: The predicted IoU (Intersection over Union) scores. @type iou: np.ndarray @param variance: A list of variances used to decode the bounding box predictions. If None then [0.1,0.2] will be used. Defaults to None. @type variance: Optional[List[float]] @return: A tuple of bboxes, keypoints, and scores.     - bboxes: NumPy array of shape (N, 4) containing the decoded bounding boxes in the format [x_min, y_min, width, height].     - keypoints: A NumPy array of shape (N, 10) containing the decoded keypoint coordinates for each anchor.     - scores: A NumPy array of shape (N, 1) containing the combined scores for each anchor. @rtype: Tuple[np.ndarray, np.ndarray, np.ndarray]
function
prune_detections(bboxes: np.ndarray, keypoints: np.ndarray, scores: np.ndarray, conf_threshold: float)
Prune detections based on confidence threshold.  @param bboxes: A numpy array of shape (N, 4) containing the bounding boxes. @type bboxes: np.ndarray @param keypoints: A numpy array of shape (N, 10) containing the keypoints. @type keypoints: np.ndarray @param scores: A numpy array of shape (N,) containing the scores. @type scores: np.ndarray @param conf_threshold: The confidence threshold. @type conf_threshold: float @return: A tuple of bboxes, keypoints, and scores.     - bboxes: NumPy array of shape (N, 4) containing the pruned bounding boxes in the format [x_min, y_min, width, height].     - keypoints: A NumPy array of shape (N, 10) containing the pruned keypoint coordinates for each anchor.     - scores: A NumPy array of shape (N, 1) containing the combined scores for each anchor. @rtype: Tuple[np.ndarray, np.ndarray, np.ndarray]
function
format_detections(bboxes: np.ndarray, keypoints: np.ndarray, scores: np.ndarray, input_size: Tuple [ int , int ])
Format detections into bounding box, keypoint, and score arrays.  @param bboxes: A numpy array of shape (N, 4) containing the bounding boxes. @type bboxes: np.ndarray @param keypoints: A numpy array of shape (N, 10) containing the keypoints. @type keypoints: np.ndarray @param scores: A numpy array of shape (N,) containing the scores. @type scores: np.ndarray @param input_size: A tuple representing the width and height of the input image. @type input_size: Tuple[int, int] @return: A tuple of bboxes, keypoints, and scores.     - bboxes: NumPy array of shape (N, 4) containing the decoded bounding boxes in the format [x_min, y_min, width, height].     - keypoints: A NumPy array of shape (N, 10) containing the decoded keypoint coordinates for each anchor.     - scores: A NumPy array of shape (N, 1) containing the combined scores for each anchor. @rtype: Tuple[np.ndarray, np.ndarray, np.ndarray]
function
decode_and_prune_detections(input_size: Tuple [ int , int ], loc: np.ndarray, conf: np.ndarray, iou: np.ndarray, conf_threshold: float, anchors: np.ndarray, variance: Optional [ List [ float ] ] = None)
Optimized function that combines decode_detections and prune_detections. Performs early pruning to avoid processing low-confidence detections.  @param input_size: The size of the input image (width, height). @param loc: The predicted locations (or offsets) of the bounding boxes. @param conf: The predicted class confidence scores. @param iou: The predicted IoU (Intersection over Union) scores. @param conf_threshold: The confidence threshold for pruning. @param anchors: Pre-computed anchors to avoid regeneration. @param variance: A list of variances used to decode the bounding box predictions. @return: A tuple of bboxes, keypoints, and scores.
module

depthai_nodes.node.parsers.xfeat

class
XFeatBaseParser
Base parser class for parsing the output of the XFeat model. It is the parent class of the XFeatMonoParser and XFeatStereoParser classes.  Attributes ---------- reference_input : Node.Input     Reference input for stereo mode. It is a linking point to which the Neural Network's output is linked. It accepts the output of the Neural Network node. target_input : Node.Input     Target input for stereo mode. It is a linking point to which the Neural Network's output is linked. It accepts the output of the Neural Network node. output_layer_feats : str     Name of the output layer containing features. output_layer_keypoints : str     Name of the output layer containing keypoints. output_layer_heatmaps : str     Name of the output layer containing heatmaps. original_size : Tuple[float, float]     Original image size. input_size : Tuple[float, float]     Input image size. max_keypoints : int     Maximum number of keypoints to keep.  Error Handling -------------- **ValueError**: If the number of output layers is not 3. **ValueError**: If the original image size is not specified. **ValueError**: If the input image size is not specified. **ValueError**: If the maximum number of keypoints is not specified. **ValueError**: If the output layer containing features is not specified. **ValueError**: If the output layer containing keypoints is not specified. **ValueError**: If the output layer containing heatmaps is not specified.
class

depthai_nodes.node.parsers.xfeat.XFeatBaseParser(depthai_nodes.node.parsers.base_parser.BaseParser)

method
variable
variable
variable
variable
variable
variable
property
reference_input
Returns the reference input.
property
target_input
Returns the target input.
method
variable
method
method
setOutputLayerFeats(self, output_layer_feats: str)
Sets the output layer containing features.  @param output_layer_feats: Name of the output layer containing features. @type output_layer_feats: str
method
setOutputLayerKeypoints(self, output_layer_keypoints: str)
Sets the output layer containing keypoints.  @param output_layer_keypoints: Name of the output layer containing keypoints. @type output_layer_keypoints: str
method
setOutputLayerHeatmaps(self, output_layer_heatmaps: str)
Sets the output layer containing heatmaps.  @param output_layer_heatmaps: Name of the output layer containing heatmaps. @type output_layer_heatmaps: str
method
setOriginalSize(self, original_size: Tuple [ int , int ])
Sets the original image size.  @param original_size: Original image size. @type original_size: Tuple[int, int]
method
setInputSize(self, input_size: Tuple [ int , int ])
Sets the input image size.  @param input_size: Input image size. @type input_size: Tuple[int, int]
method
setMaxKeypoints(self, max_keypoints: int)
Sets the maximum number of keypoints to keep.  @param max_keypoints: Maximum number of keypoints. @type max_keypoints: int
method
build(self, head_config: Dict [ str , Any ]) -> XFeatBaseParser: XFeatBaseParser
Configures the parser.  @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: XFeatBaseParser
method
validateParams(self)
Validates the parameters.
method
extractTensors(self, output: dai.NNData) -> Tuple[np.ndarray, np.ndarray, np.ndarray]: Tuple[np.ndarray, np.ndarray, np.ndarray]
Extracts the tensors from the output. It returns the features, keypoints, and heatmaps. It also handles the reshaping of the tensors by requesting the NCHW storage order.  @param output: Output from the Neural Network node. @type output: dai.NNData @return: Tuple of features, keypoints, and heatmaps. @rtype: Tuple[np.ndarray, np.ndarray, np.ndarray]
class

depthai_nodes.node.parsers.HRNetParser(depthai_nodes.node.parsers.KeypointParser)

method
__init__(self, output_layer_name: str = '', score_threshold: float = 0.5, label_names: Optional [ List [ str ] ] = None, edges: Optional [ List [ Tuple [ int , int ] ] ] = None)
Initializes the parser node.  @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str @param score_threshold: Confidence score threshold for detected keypoints. @type score_threshold: float @param label_names: Label names for the keypoints. @type label_names: Optional[List[str]] @param edges: Keypoint connection pairs for visualizing the skeleton. Example:     [(0,1), (1,2), (2,3), (3,0)] shows that keypoint 0 is connected to keypoint     1, keypoint 1 is connected to keypoint 2, etc. @type edges: Optional[List[Tuple[int, int]]]
method
setOutputLayerName(self, output_layer_name: str)
Sets the name of the output layer.  @param output_layer_name: The name of the output layer. @type output_layer_name: str
variable
method
build(self, head_config: Dict [ str , Any ]) -> HRNetParser: HRNetParser
Configures the parser.  @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: HRNetParser
method
variable
class

depthai_nodes.node.parsers.KeypointParser(depthai_nodes.node.BaseParser)

method
__init__(self, output_layer_name: str = '', scale_factor: float = 1.0, n_keypoints: int = None, score_threshold: float = None, label_names: Optional [ List [ str ] ] = None, edges: Optional [ List [ List [ int ] ] ] = None)
Initializes the parser node.  @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str @param scale_factor: Scale factor to divide the keypoints by. @type scale_factor: float @param n_keypoints: Number of keypoints. @type n_keypoints: int @param score_threshold: Confidence score threshold for detected keypoints. @type score_threshold: float @param label_names: Label names for the keypoints. @type label_names: Optional[List[str]] @param edges: Keypoint connection pairs for visualizing the skeleton. Example:     [(0,1), (1,2), (2,3), (3,0)] shows that keypoint 0 is connected to keypoint     1, keypoint 1 is connected to keypoint 2, etc. @type edges: Optional[List[Tuple[int, int]]]
variable
variable
variable
variable
variable
variable
method
setOutputLayerName(self, output_layer_name: str)
Sets the name of the output layer.  @param output_layer_name: The name of the output layer. @type output_layer_name: str
method
setScaleFactor(self, scale_factor: float)
Sets the scale factor to divide the keypoints by.  @param scale_factor: Scale factor to divide the keypoints by. @type scale_factor: float
method
setNumKeypoints(self, n_keypoints: int)
Sets the number of keypoints.  @param n_keypoints: Number of keypoints. @type n_keypoints: int
method
setScoreThreshold(self, threshold: float)
Sets the confidence score threshold for the detected body keypoints.  @param threshold: Confidence score threshold for detected keypoints. @type threshold: float
method
setLabelNames(self, label_names: List [ str ])
Sets the label names for the keypoints.  @param label_names: List of label names for the keypoints. @type label_names: List[str]
method
setEdges(self, edges: List [ Tuple [ int , int ] ])
Sets the edges for the keypoints.  @param edges: List of edges for the keypoints. Example: [(0,1), (1,2), (2,3),     (3,0)] shows that keypoint 0 is connected to keypoint 1, keypoint 1 is     connected to keypoint 2, etc. @type edges: List[Tuple[int, int]]
method
build(self, head_config: Dict [ str , Any ]) -> KeypointParser: KeypointParser
Configures the parser.  @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: KeypointParser
method
class

depthai_nodes.node.parsers.SuperAnimalParser(depthai_nodes.node.parsers.KeypointParser)

method
__init__(self, output_layer_name: str = '', scale_factor: float = 256.0, n_keypoints: int = 39, score_threshold: float = 0.5, label_names: Optional [ List [ str ] ] = None, edges: Optional [ List [ Tuple [ int , int ] ] ] = None)
Initializes the parser node.  @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str @param n_keypoints: Number of keypoints. @type n_keypoints: int @param score_threshold: Confidence score threshold for detected keypoints. @type score_threshold: float @param scale_factor: Scale factor to divide the keypoints by. @type scale_factor: float @param label_names: Label names for the keypoints. @type label_names: Optional[List[str]] @param edges: Keypoint connection pairs for visualizing the skeleton. Example:     [(0,1), (1,2), (2,3), (3,0)] shows that keypoint 0 is connected to keypoint     1, keypoint 1 is connected to keypoint 2, etc. @type edges: Optional[List[Tuple[int, int]]]
method
build(self, head_config: Dict [ str , Any ]) -> SuperAnimalParser: SuperAnimalParser
Configures the parser.  @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: SuperAnimalParser
method
variable
module

depthai_nodes.node.parsing_neural_network

type variable
module

depthai_nodes.node.snaps_uploader

variable
package

depthai_nodes.node.utils

module
module
module
function
to_planar(arr: np.ndarray, shape: Tuple) -> np.ndarray: np.ndarray
Converts the input image `arr` (NumPy array) to the planar format expected by depthai. The image is resized to the dimensions specified in `shape`.  @param arr: Input NumPy array (image). @type arr: np.ndarray @param shape: Target dimensions (width, height). @type shape: tuple @return: A 1D NumPy array with the planar image data. @rtype: np.ndarray
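A reference sketch of the interleaved-to-planar conversion using OpenCV, assuming a 3-channel HWC input:

```python
import cv2
import numpy as np

def to_planar_ref(arr: np.ndarray, shape: tuple) -> np.ndarray:
    # cv2.resize takes (width, height); transpose HWC -> CHW, then
    # flatten so the three channel planes are laid out consecutively.
    resized = cv2.resize(arr, shape)
    return resized.transpose(2, 0, 1).flatten()
```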
module
function
generate_script_content(resize_width: int, resize_height: int, resize_mode: str = 'STRETCH', padding: float = 0.0) -> str: str
Generates the script content for the dai.Script node.  It crops and resizes the input image based on the detected object, with optional padding and label filtering. If a zero-area detection is encountered, an error message is issued.  @param resize_width: Target width for the resized image @type resize_width: int @param resize_height: Target height for the resized image @type resize_height: int @param resize_mode: Resize mode for the image. Supported values: "CENTER_CROP",     "LETTERBOX", "NONE", "STRETCH". Default: "STRETCH".     "STRETCH" - stretches the image so that the corners of the region are now in the corners of the output image.     "CENTER_CROP" - resizes + crops the image to keep aspect ratio and fill the final size.     "LETTERBOX" - resizes + pads the image to the final size to keep aspect ratio.     "NONE" - does not scale and pads top, bottom, left and right to fill the final image. @type resize_mode: str @param padding: Additional padding around the detection in normalized coordinates     (0-1) @type padding: float @param valid_labels: List of valid label indices to filter detections. If None, all     detections are processed @type valid_labels: Optional[List[int]] @return: Generated script content as a string @rtype: str
function
nms_detections(detections: List [ dai.ImgDetection ], conf_thresh = 0.3, iou_thresh = 0.4)
Applies Non-Maximum Suppression (NMS) on a list of dai.ImgDetection objects.  @param detections: List of dai.ImgDetection objects. @type detections: list[dai.ImgDetection] @param conf_thresh: Confidence threshold for filtering boxes. @type conf_thresh: float @param iou_thresh: IoU threshold for Non-Maximum Suppression (NMS). @type iou_thresh: float  @return: A list of dai.ImgDetection objects after applying NMS. @rtype: list[dai.ImgDetection]
module

depthai_nodes.node.utils.message_remapping

function
function
function
function
function
function
function
function
function
function
function
function
function
function
function
function
module

depthai_nodes.node.utils.nms

function
nms(boxes, scores, iou_thresh)
Perform Non-Maximum Suppression (NMS).  @param boxes: An ndarray of shape (N, 4), where each row is [xmin, ymin, xmax,     ymax]. @type boxes: np.ndarray @param scores: An ndarray of shape (N,), containing the confidence scores for each     box. @type scores: np.ndarray @param iou_thresh: The IoU threshold for Non-Maximum Suppression (NMS). @type iou_thresh: float @return: A list of indices of the boxes to keep after applying NMS. @rtype: list[int]
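A classic greedy NMS reference sketch matching the documented box format; the package's implementation may differ in details:

```python
import numpy as np

def nms_ref(boxes: np.ndarray, scores: np.ndarray, iou_thresh: float) -> list:
    # Greedy NMS: repeatedly keep the highest-scoring box and drop
    # boxes that overlap it above iou_thresh.
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter + 1e-9)
        order = order[1:][iou <= iou_thresh]
    return keep
```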
module

depthai_nodes.node.utils.util_constants

class

depthai_nodes.node.ApplyColormap(depthai_nodes.node.BaseHostNode)

method
method
setColormap(self, colormapValue: Union [ int , np.ndarray ])
Set the color mapping applied to incoming maps.  @param colormapValue: OpenCV colormap enum value or a custom OpenCV-compatible     LUT. @type colormapValue: Union[int, np.ndarray]
method
setMaxValue(self, maxValue: int)
Set the normalization ceiling used during colorization.  @param maxValue: Maximum input value used for normalization. 0 keeps per-frame     normalization. @type maxValue: int
method
build(self, frame: dai.Node.Output) -> ApplyColormap: ApplyColormap
Connect the input stream to the node.  @param frame: Upstream output producing the map-like message to colorize. @type frame: dai.Node.Output @return: The configured node instance. @rtype: ApplyColormap
method
process(self, frame: dai.Buffer)
Convert the incoming map-like message into a colorized image frame.
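A usage sketch based on the methods documented above; the pipeline-creation pattern, import path, and `source` placeholder are assumptions:

```python
import cv2
import depthai as dai
from depthai_nodes.node import ApplyColormap  # import path assumed

with dai.Pipeline() as pipeline:
    # `source` stands for any upstream node output producing a map-like
    # message (placeholder, not a real node here).
    colorizer = pipeline.create(ApplyColormap).build(frame=source.out)
    colorizer.setColormap(cv2.COLORMAP_JET)  # any OpenCV colormap enum works
    colorizer.setMaxValue(0)                 # 0 keeps per-frame normalization
```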
class

depthai_nodes.node.ApplyDepthColormap(depthai_nodes.node.BaseHostNode)

method
method
setColormap(self, colormapValue: Union [ int , np.ndarray ])
Set the color mapping applied to depth images.  @param colormapValue: OpenCV colormap enum value or a custom OpenCV-compatible     LUT. @type colormapValue: Union[int, np.ndarray]
method
setPercentileRange(self, low: float, high: float)
Set the percentile clipping range used for normalization.  @param low: Lower percentile in the range [0, 100). @type low: float @param high: Upper percentile in the range (0, 100]. @type high: float
method
build(self, frame: dai.Node.Output) -> ApplyDepthColormap: ApplyDepthColormap
Connect the input depth stream to the node.  @param frame: Upstream output producing a RAW depth dai.ImgFrame. @type frame: dai.Node.Output @return: The configured node instance. @rtype: ApplyDepthColormap
method
process(self, frame: dai.Buffer)
Convert the incoming depth frame into a colorized image frame.
class

depthai_nodes.node.BaseHostNode(depthai.node.HostNode)

constant
method
method
process(self, msgs: dai.Buffer)
Process one synchronized batch of input messages.
class

depthai_nodes.node.BaseThreadedHostNode(depthai.node.ThreadedHostNode)

constant
method
method
run(self)
Run the threaded host-node loop.
class

depthai_nodes.node.CoordinatesMapper(depthai_nodes.node.BaseThreadedHostNode)

constant
method
property
out
Return the remapped output stream.
method
build(self, toTransformationInput: dai.Node.Output, fromTransformationInput: dai.Node.Output) -> CoordinatesMapper: CoordinatesMapper
Connect the target and source streams used for coordinate remapping.  @param toTransformationInput: Stream providing messages whose transformation     defines the target reference frame. @type toTransformationInput: dai.Node.Output @param fromTransformationInput: Stream providing messages whose coordinates     should be remapped. @type fromTransformationInput: dai.Node.Output @return: The configured node instance. @rtype: CoordinatesMapper
method
run(self)
Cache the latest target transformation and remap incoming messages.
class

depthai_nodes.node.DepthMerger(depthai_nodes.node.BaseHostNode)

method
variable
variable
method
build(self, output2d: dai.Node.Output, outputDepth: dai.Node.Output, calibData: dai.CalibrationHandler, depthAlignmentSocket: dai.CameraBoardSocket = dai.CameraBoardSocket.CAM_A, shrinkingFactor: float = 0) -> DepthMerger: DepthMerger
Connect detection and depth streams and initialize spatial conversion.  @param output2d: Upstream output producing 2D detections. @type output2d: dai.Node.Output @param outputDepth: Upstream output producing aligned depth frames. @type outputDepth: dai.Node.Output @param calibData: Device calibration used to convert image coordinates into     spatial coordinates. @type calibData: dai.CalibrationHandler @param depthAlignmentSocket: Camera socket the depth frame is aligned to. @type depthAlignmentSocket: dai.CameraBoardSocket @param shrinkingFactor: Percentage of each bounding box edge trimmed before     depth averaging. @type shrinkingFactor: float @return: The configured node instance. @rtype: DepthMerger
variable
method
process(self, message2d: dai.Buffer, depth: dai.ImgFrame)
Merge incoming detections with depth to produce spatial detections.
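Example (sketch) — continues a setup like the one above; ``det_nn`` and ``stereo`` are placeholders for an already-built detection network and StereoDepth node, and ``pipeline.getDefaultDevice()`` as the calibration source is an assumption:

```python
from depthai_nodes.node import DepthMerger

merger = pipeline.create(DepthMerger).build(
    output2d=det_nn.out,       # 2D detections
    outputDepth=stereo.depth,  # depth aligned to depthAlignmentSocket
    calibData=pipeline.getDefaultDevice().readCalibration(),  # assumed accessor
)
spatial_q = merger.out.createOutputQueue()  # spatial detections with X/Y/Z
```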
class

depthai_nodes.node.ExtendedNeuralNetwork(depthai_nodes.node.BaseThreadedHostNode)

method
property
out
Return the primary parsed output stream.
property
outputs
Return the multi-head output stream when available.
property
passthrough
Return the passthrough stream from the underlying NN node.
method
build(self, inputImage: dai.node.Camera | dai.Node.Output, nnSource: Union [ dai.NNModelDescription , dai.NNArchive , str ], resizeMode: dai.ImageManipConfig.ResizeMode = dai.ImageManipConfig.ResizeMode.CENTER_CROP) -> ExtendedNeuralNetwork: ExtendedNeuralNetwork
Build the internal inference pipeline.  @param inputImage: Source of input frames. Camera nodes are resized on-device by     the camera; generic outputs are resized via an internal ImageManip. @type inputImage: dai.node.Camera | dai.Node.Output @param nnSource: HubAI model slug, dai.NNModelDescription, or dai.NNArchive. @type nnSource: Union[dai.NNModelDescription, dai.NNArchive, str] @param resizeMode: Resize strategy used when adapting frames to the network     input shape. @type resizeMode: dai.ImageManipConfig.ResizeMode @return: The configured node instance. @rtype: ExtendedNeuralNetwork
method
run(self)
No-op required by ``BaseThreadedHostNode``.
class

depthai_nodes.node.FrameCropper(depthai_nodes.node.BaseThreadedHostNode)

constant
constant
method
property
out
Return the cropped frame output stream.
method
fromImgDetections(self, inputImgDetections: dai.Node.Output, outputSize: Tuple [ int , int ], resizeMode: dai.ImageManipConfig.ResizeMode = dai.ImageManipConfig.ResizeMode.CENTER_CROP, padding: float = 0.0) -> FrameCropper: FrameCropper
Configure cropping from an ImgDetections stream.  In this mode the node generates ImageManipConfig messages per detection (via Script) and outputs one cropped ImgFrame per detection. `padding` expands the crop region.

method
fromManipConfigs(self, inputManipConfigs: dai.Node.Output, maxOutputFrameSize: int, waitForConfig: bool) -> FrameCropper: FrameCropper
Configure cropping from a stream of precomputed ImageManipConfig groups.  Expects `inputManipConfigs` to output dai.MessageGroup messages where each value is an ImageManipConfig. An on-device Script node pairs each config with the current frame and forwards them to ImageManip.  Key naming is arbitrary; all values in the MessageGroup are treated as configs.
method
build(self, inputImage: dai.Node.Output) -> FrameCropper: FrameCropper
Build the internal pipeline and set output size / resize behavior.  Requires that exactly one configuration path was selected via `fromImgDetections` or `fromManipConfigs` before calling. Returns `self` for fluent chaining.
method
run(self)
No-op because cropping is driven entirely by on-device Script nodes.
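Example (sketch) — the detection-driven mode; ``pipeline`` and ``det_nn`` (a detection network with a ``passthrough`` stream) are placeholders:

```python
from depthai_nodes.node import FrameCropper

cropper = pipeline.create(FrameCropper)
cropper.fromImgDetections(
    det_nn.out,
    outputSize=(128, 128),
    padding=0.1,  # expand each crop region by 10%
)
cropper.build(det_nn.passthrough)  # crop from the frames the NN actually saw
crops_q = cropper.out.createOutputQueue()  # one cropped ImgFrame per detection
```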
class

depthai_nodes.node.GatherData(depthai.node.ThreadedHostNode, typing.Generic)

constant
constant
method
__init__(self)
Initializes the GatherData node.
property
out
Return the gathered output stream.
method
setCameraFps(self, fps: int)
Set the camera frame rate used for timestamp matching.  @param fps: Positive camera frame rate used for matching tolerance and polling. @type fps: int
method
setWaitCountFn(self, fn: Callable [ [ TReference ] , int ])
Set the callback that returns the expected item count per reference.
method
build(self, cameraFps: int, inputData: dai.Node.Output, inputReference: dai.Node.Output, waitCountFn: Optional [ Callable [ [ TReference ] , int ] ] = None) -> GatherData[TReference, TGathered]: GatherData[TReference, TGathered]
Connect the data and reference streams used for gathering.  @param cameraFps: Camera frame rate used to derive timestamp matching tolerance     and polling interval. @type cameraFps: int @param inputData: Upstream output producing the data messages to gather. @type inputData: dai.Node.Output @param inputReference: Upstream output producing the reference messages. @type inputReference: dai.Node.Output @param waitCountFn: Optional callback returning the number of data messages     expected for a given reference. If omitted, defaults to     len(reference.detections). @type waitCountFn: Optional[Callable[[TReference], int]] @return: The configured node instance. @rtype: GatherData[TReference, TGathered]
method
run(self)
Poll both inputs, match messages by timestamp, and emit ready groups.
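Example (sketch) — typical two-stage wiring; ``pipeline``, a detection network ``det_nn``, and a per-crop recognition network ``rec_nn`` are placeholders:

```python
from depthai_nodes.node import GatherData

gather = pipeline.create(GatherData)
gather.build(
    cameraFps=30,
    inputData=rec_nn.out,       # one message per detection crop
    inputReference=det_nn.out,  # reference message with a .detections list
    # waitCountFn omitted: defaults to len(reference.detections)
)
gathered_q = gather.out.createOutputQueue()
# after pipeline.start(), gathered_q.get() yields GatheredData
# (reference_data plus the matched items)
```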
class

depthai_nodes.node.HostParsingNeuralNetwork(depthai_nodes.node.ParsingNeuralNetwork)

class

depthai_nodes.node.HostSpatialsCalc

method
variable
variable
variable
variable
variable
method
setLowerThreshold(self, thresholdLow: int)
Set the lower depth threshold used during ROI averaging.  @param thresholdLow: Lower accepted depth value. @type thresholdLow: int
method
setUpperThreshold(self, thresholdHigh: int)
Set the upper depth threshold used during ROI averaging.  @param thresholdHigh: Upper accepted depth value. @type thresholdHigh: int
method
setDeltaRoi(self, delta: int)
Set the half-size of the ROI used around point inputs.
method
calcSpatials(self, depthData: dai.ImgFrame, roi: List [ int ], averagingMethod: Callable = np.mean) -> Dict[str, float]: Dict[str, float]
Calculate spatial coordinates from the depth frame within the ROI.  @param depthData: Depth frame used for coordinate estimation. @type depthData: dai.ImgFrame @param roi: Region of interest or point. @type roi: List[int] @param averagingMethod: Callable used to reduce valid depth values inside the     ROI. @type averagingMethod: Callable @return: Spatial coordinates in camera space. @rtype: Dict[str, float]
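Example (sketch) — host-side usage; the constructor arguments are an assumption (the class needs calibration data), and ``depth_frame`` stands for a dai.ImgFrame pulled from a depth queue:

```python
import numpy as np
from depthai_nodes.node import HostSpatialsCalc

calc = HostSpatialsCalc(calib_data)  # calib_data: dai.CalibrationHandler (assumed)
calc.setLowerThreshold(200)    # ignore depth below 200 mm
calc.setUpperThreshold(10000)  # ignore depth above 10 m
calc.setDeltaRoi(5)            # 5 px half-size ROI around point inputs

spatials = calc.calcSpatials(depth_frame, roi=[300, 200, 340, 240],
                             averagingMethod=np.median)
print(spatials["x"], spatials["y"], spatials["z"])  # assumed key names
```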
class

depthai_nodes.node.ImgDetectionsFilter(depthai_nodes.node.BaseHostNode)

method
method
setLabels(self, labels: List [ int ], keep: bool)
Deprecated wrapper for configuring label inclusion or exclusion.
method
setConfidenceThreshold(self, confidenceThreshold: float | None)
Deprecated wrapper for setting the minimum confidence.
method
setMaxDetections(self, maxDetections: int)
Deprecated wrapper for limiting the number of detections.
method
setSortByConfidence(self, sortByConfidence: bool)
Deprecated wrapper for toggling confidence-based sorting.
method
setMinArea(self, minArea: float)
Deprecated wrapper for setting the minimum detection area.
method
method
method
minConfidence(self, threshold: float) -> ImgDetectionsFilter: ImgDetectionsFilter
Require detections to meet the minimum confidence threshold.
method
minArea(self, area: float) -> ImgDetectionsFilter: ImgDetectionsFilter
Require detections to meet the minimum normalized bounding-box area.
method
sortByConfidence(self, desc: bool = True) -> ImgDetectionsFilter: ImgDetectionsFilter
Enable sorting by confidence (before top-k).  Set direction via `desc`.
method
method
enableSorting(self) -> ImgDetectionsFilter: ImgDetectionsFilter
Enable sorting using the last configured sort settings.
method
disableSorting(self) -> ImgDetectionsFilter: ImgDetectionsFilter
Disable sorting but keep the last configured sort settings.
method
takeFirstK(self, k: Optional [ int ])
Keep only the first ``k`` detections after filtering and sorting.
method
method
process(self, msg: dai.Buffer)
Filter, optionally suppress, sort, and emit the detections message.
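Example (sketch) — fluent configuration; a ``build(detections)`` entry point analogous to the other host nodes is assumed, with ``pipeline`` and ``det_nn`` as placeholders:

```python
from depthai_nodes.node import ImgDetectionsFilter

filt = pipeline.create(ImgDetectionsFilter).build(det_nn.out)  # build args assumed
filt.minConfidence(0.5).minArea(0.001).sortByConfidence(desc=True)
filt.takeFirstK(10)  # keep at most 10 detections after filtering and sorting
filtered_q = filt.out.createOutputQueue()
```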
class

depthai_nodes.node.ImgFrameOverlay(depthai_nodes.node.BaseHostNode)

method
method
setAlpha(self, alpha: float)
Set the background contribution used during overlay.  @param alpha: Weight of the background frame in the blended output. @type alpha: float
method
setPreserveBackground(self, preserveBackground: bool)
Set whether zero-valued foreground pixels preserve the background.  @param preserveBackground: If True, zero areas in the foreground frame are     ignored in the output image. @type preserveBackground: bool
method
build(self, frame1: dai.Node.Output, frame2: dai.Node.Output, alpha: Optional [ float ] = None, preserveBackground: Optional [ bool ] = None) -> ImgFrameOverlay: ImgFrameOverlay
Connect the input streams and optionally update overlay settings.  @param frame1: Upstream output producing the background frame. @type frame1: dai.Node.Output @param frame2: Upstream output producing the foreground frame. @type frame2: dai.Node.Output @param alpha: Optional blend weight for the background frame. @type alpha: Optional[float] @param preserveBackground: Optional override for whether zero-valued foreground     pixels preserve the background frame. @type preserveBackground: Optional[bool] @return: The configured node instance. @rtype: ImgFrameOverlay
method
process(self, frame1: dai.Buffer, frame2: dai.Buffer)
Overlay the foreground frame onto the background frame.
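Example (sketch) — blends a colorized mask over the camera feed; ``pipeline``, ``cam``, and ``colorize`` are placeholders for already-built nodes:

```python
from depthai_nodes.node import ImgFrameOverlay

overlay = pipeline.create(ImgFrameOverlay).build(
    frame1=cam.requestOutput((640, 480)),  # background
    frame2=colorize.out,                   # foreground (e.g., colorized mask)
    alpha=0.6,                # weight of the background in the blend
    preserveBackground=True,  # keep background where the foreground is zero
)
overlay_q = overlay.out.createOutputQueue()
```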
class

depthai_nodes.node.InstanceToSemanticMask(depthai_nodes.node.BaseHostNode)

method
method
build(self, detections: dai.Node.Output) -> InstanceToSemanticMask: InstanceToSemanticMask
Connect the detections stream to the semantic-mask converter.
method
process(self, msg: dai.Buffer)
Convert instance IDs in the segmentation mask into class labels.
class

depthai_nodes.node.MessageCollector(depthai.node.ThreadedHostNode, typing.Generic)

method
__init__(self)
Initializes the MessageCollector node.
property
out
Return the collected output stream.
method
setCameraFps(self, fps: int)
Set the camera frame rate used for timestamp matching.  @param fps: Positive camera frame rate used for matching tolerance and polling. @type fps: int
method
build(self, cameraFps: int, inputData: dai.Node.Output) -> MessageCollector[TCollected]: MessageCollector[TCollected]
Connect the input data stream used for collecting.  @param cameraFps: Camera frame rate used to derive timestamp matching tolerance and polling interval. @type cameraFps: int @param inputData: Upstream output producing the data messages to collect. @type inputData: dai.Node.Output @return: The configured node instance. @rtype: MessageCollector[TCollected]
method
run(self)
Poll the input, group messages by timestamp, and emit ready batches.
class

depthai_nodes.node.ParserGenerator(depthai.node.ThreadedHostNode)

constant
constant
method
build(self, nnArchive: dai.NNArchive, headIndex: Optional [ int ] = None, hostOnly: bool = False) -> Dict: Dict
Instantiate parser nodes for the supplied model archive.  @param nnArchive: Model archive describing the parser configuration. @type nnArchive: dai.NNArchive @param headIndex: Optional model head index to instantiate. If omitted, parsers     are created for all heads. @type headIndex: Optional[int] @param hostOnly: If True, prefer host-side parser implementations where     available. @type hostOnly: bool @return: Mapping of model head index to parser node. @rtype: Dict
method
run(self)
No-op required by ``dai.node.ThreadedHostNode``.
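Example (sketch) — the archive path is hypothetical, ``pipeline`` is a placeholder dai.Pipeline, and integer head keys are assumed from the documented mapping:

```python
import depthai as dai
from depthai_nodes.node import ParserGenerator

archive = dai.NNArchive("path/to/model.tar.xz")  # hypothetical archive path
parsers = pipeline.create(ParserGenerator).build(archive)
parser = parsers[0]  # parser node created for model head 0
```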
class

depthai_nodes.node.BaseParser(depthai.node.ThreadedHostNode)

method
property
property
method
input.setter(self, node: dai.Node.Input)
Input to which the Neural Network's output is linked.
method
out.setter(self, node: dai.Node.Output)
Output node to which the processed network results are sent in the form of a DepthAI message.
method
build(self, head_config: Dict [ str , Any ]) -> BaseParser: BaseParser
Configures the parser based on the specified head configuration.  @param head_config: A dictionary containing configuration details relevant to     the parser, including parameters and settings required for output parsing. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: BaseParser
method
run(self)
Parses the output from the neural network head.  This method should be overridden by subclasses to implement the specific parsing logic and send the resulting DepthAI message via the parser's output.  @return message: The parsed output message, as defined by the logic in the subclass. @rtype message: Any
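Example (sketch) — a minimal custom parser; the parsing body is a placeholder, and ``isRunning()`` from ``ThreadedHostNode`` is assumed for the loop condition:

```python
import depthai as dai
from depthai_nodes.node import BaseParser


class MyParser(BaseParser):
    def run(self):
        while self.isRunning():
            nn_data: dai.NNData = self.input.get()  # blocking read of NN output
            # ... extract tensors from nn_data and build a proper message here ...
            msg = dai.Buffer()
            msg.setTimestamp(nn_data.getTimestamp())  # keep timestamps aligned
            self.out.send(msg)
```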
class

depthai_nodes.node.ClassificationParser(depthai_nodes.node.BaseParser)

method
__init__(self, output_layer_name: str = '', classes: List [ str ] = None, is_softmax: bool = True)
Initializes the parser node.  @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str @param classes: List of class names to be used for linking with their respective scores. Expected to be in the same order as the Neural Network's output. If not provided, the message will only return sorted scores. @type classes: List[str] @param is_softmax: If False, the scores are converted to probabilities using the softmax function. @type is_softmax: bool
variable
variable
variable
variable
method
setOutputLayerName(self, output_layer_name: str)
Sets the name of the output layer.  @param output_layer_name: The name of the output layer. @type output_layer_name: str
method
setClasses(self, classes: List [ str ])
Sets the class names for the classification model.  @param classes: List of class names to be used for linking with their respective     scores. @type classes: List[str]
method
setSoftmax(self, is_softmax: bool)
Sets the softmax flag for the classification model.  @param is_softmax: If False, the parser will convert the scores to probabilities using the softmax function. @type is_softmax: bool
method
build(self, head_config: Dict [ str , Any ]) -> ClassificationParser: ClassificationParser
Configures the parser.  @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: ClassificationParser
method
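Example (sketch) — manual configuration; the layer name is hypothetical, and ``pipeline`` and ``nn`` stand for an already-built dai.Pipeline and dai.node.NeuralNetwork:

```python
from depthai_nodes.node import ClassificationParser

parser = pipeline.create(ClassificationParser)
parser.setOutputLayerName("logits")        # hypothetical layer name
parser.setClasses(["cat", "dog", "bird"])  # same order as the model output
parser.setSoftmax(False)                   # raw scores: parser applies softmax
nn.out.link(parser.input)
scores_q = parser.out.createOutputQueue()  # Classifications messages
```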
class

depthai_nodes.node.ClassificationSequenceParser(depthai_nodes.node.ClassificationParser)

method
__init__(self, output_layer_name: str = '', classes: List [ str ] = None, is_softmax: bool = True, ignored_indexes: List [ int ] = None, remove_duplicates: bool = False, concatenate_classes: bool = False)
Initializes the parser node.  @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str @param classes: List of available classes for the model. @type classes: List[str] @param ignored_indexes: List of indexes to ignore during classification generation (e.g., background class, blank space). @type ignored_indexes: List[int] @param is_softmax: If False, the scores are converted to probabilities using the softmax function. @type is_softmax: bool @param remove_duplicates: If True, removes consecutive duplicates from the sequence. @type remove_duplicates: bool @param concatenate_classes: If True, concatenates consecutive words based on the predicted spaces. @type concatenate_classes: bool
variable
variable
variable
method
setRemoveDuplicates(self, remove_duplicates: bool)
Sets the remove_duplicates flag for the classification sequence model.  @param remove_duplicates: If True, removes consecutive duplicates from the     sequence. @type remove_duplicates: bool
method
setIgnoredIndexes(self, ignored_indexes: List [ int ])
Sets the ignored_indexes for the classification sequence model.  @param ignored_indexes: A list of indexes to ignore during classification     generation. @type ignored_indexes: List[int]
method
setConcatenateClasses(self, concatenate_classes: bool)
Sets the concatenate_classes flag for the classification sequence model.  @param concatenate_classes: If True, concatenates consecutive classes into a     single string. Used mostly for text processing. @type concatenate_classes: bool
method
build(self, head_config: Dict [ str , Any ]) -> ClassificationSequenceParser: ClassificationSequenceParser
Configures the parser.  @param head_config: The head configuration for the parser. The required keys are `classes`, `n_classes`, and `is_softmax`. In addition to these, there are three optional keys that are mostly used for text processing: `ignored_indexes`, `remove_duplicates`, and `concatenate_classes`. @type head_config: Dict[str, Any]  @return: Returns the instantiated parser with the correct configuration. @rtype: ClassificationSequenceParser
method
variable
class

depthai_nodes.node.DetectionParser(depthai_nodes.node.BaseParser)

method
__init__(self, conf_threshold: float = 0.5, iou_threshold: float = 0.5, max_det: int = 100, label_names: Optional [ List [ str ] ] = None)
Initializes the parser node.  @param conf_threshold: Confidence score threshold of detected bounding boxes. @type conf_threshold: float @param iou_threshold: Non-maximum suppression threshold. @type iou_threshold: float @param max_det: Maximum number of detections to keep. @type max_det: int @param label_names: List of label names for detected objects. @type label_names: Optional[List[str]]
variable
variable
variable
variable
method
setConfidenceThreshold(self, threshold: float)
Sets the confidence score threshold for detected objects.  @param threshold: Confidence score threshold for detected objects. @type threshold: float
method
setIouThreshold(self, threshold: float)
Sets the non-maximum suppression threshold.  @param threshold: Non-maximum suppression threshold. @type threshold: float
method
setMaxDetections(self, max_det: int)
Sets the maximum number of detections to keep.  @param max_det: Maximum number of detections to keep. @type max_det: int
method
setLabelNames(self, label_names: List [ str ])
Sets the label names for detected objects.  @param label_names: List of label names for detected objects. @type label_names: List[str]
method
build(self, head_config) -> DetectionParser: DetectionParser
Configures the parser.  @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: DetectionParser
method
class

depthai_nodes.node.EmbeddingsParser(depthai_nodes.node.BaseParser)

method
__init__(self)
Initialize the EmbeddingsParser node.
variable
method
setOutputLayerNames(self, output_layer_name: str)
Sets the output layer name for the parser.  @param output_layer_name: The output layer name for the parser. @type output_layer_name: str
method
build(self, head_config: Dict [ str , Any ]) -> EmbeddingsParser: EmbeddingsParser
Sets the head configuration for the parser.  @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: EmbeddingsParser
method
class

depthai_nodes.node.FastSAMParser(depthai_nodes.node.BaseParser)

method
__init__(self, conf_threshold: float = 0.5, n_classes: int = 1, iou_threshold: float = 0.5, mask_conf: float = 0.5, prompt: str = 'everything', points: Optional [ Tuple [ int , int ] ] = None, point_label: Optional [ int ] = None, bbox: Optional [ Tuple [ int , int , int , int ] ] = None, yolo_outputs: List [ str ] = None, mask_outputs: List [ str ] = None, protos_output: str = 'protos_output')
Initializes the parser node.  @param conf_threshold: The confidence threshold for the detections @type conf_threshold: float @param n_classes: The number of classes in the model @type n_classes: int @param iou_threshold: The intersection over union threshold @type iou_threshold: float @param mask_conf: The mask confidence threshold @type mask_conf: float @param prompt: The prompt type (e.g., 'everything') @type prompt: str @param points: The point-prompt coordinates @type points: Optional[Tuple[int, int]] @param point_label: The label of the point prompt @type point_label: Optional[int] @param bbox: The bounding-box prompt @type bbox: Optional[Tuple[int, int, int, int]] @param yolo_outputs: Names of the YOLO output layers @type yolo_outputs: List[str] @param mask_outputs: Names of the mask output layers @type mask_outputs: List[str] @param protos_output: Name of the protos output layer @type protos_output: str
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
setConfidenceThreshold(self, threshold: float)
Sets the confidence score threshold.  @param threshold: Confidence score threshold. @type threshold: float
method
setNumClasses(self, n_classes: int)
Sets the number of classes in the model.  @param n_classes: The number of classes in the model. @type n_classes: int
method
setIouThreshold(self, iou_threshold: float)
Sets the intersection over union threshold.  @param iou_threshold: The intersection over union threshold. @type iou_threshold: float
method
setMaskConfidence(self, mask_conf: float)
Sets the mask confidence threshold.  @param mask_conf: The mask confidence threshold. @type mask_conf: float
method
setPrompt(self, prompt: str)
Sets the prompt type.  @param prompt: The prompt type (e.g., 'everything'). @type prompt: str
method
setPoints(self, points: Tuple [ int , int ])
Sets the point prompt.  @param points: The point-prompt coordinates. @type points: Tuple[int, int]
method
setPointLabel(self, point_label: int)
Sets the point label.  @param point_label: The label of the point prompt. @type point_label: int
method
setBoundingBox(self, bbox: Tuple [ int , int , int , int ])
Sets the bounding-box prompt.  @param bbox: The bounding-box prompt. @type bbox: Tuple[int, int, int, int]
method
setYoloOutputs(self, yolo_outputs: List [ str ])
Sets the YOLO output layer names.  @param yolo_outputs: Names of the YOLO output layers. @type yolo_outputs: List[str]
method
setMaskOutputs(self, mask_outputs: List [ str ])
Sets the mask output layer names.  @param mask_outputs: Names of the mask output layers. @type mask_outputs: List[str]
method
setProtosOutput(self, protos_output: str)
Sets the protos output layer name.  @param protos_output: Name of the protos output layer. @type protos_output: str
method
build(self, head_config: Dict [ str , Any ]) -> FastSAMParser: FastSAMParser
Configures the parser.  @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: FastSAMParser
method
class

depthai_nodes.node.ImageOutputParser(depthai_nodes.node.BaseParser)

method
__init__(self, output_layer_name: str = '', output_is_bgr: bool = False)
Initializes the parser node.  @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str @param output_is_bgr: Flag indicating if the output image is in BGR. @type output_is_bgr: bool
variable
variable
method
setOutputLayerName(self, output_layer_name: str)
Sets the name of the output layer.  @param output_layer_name: The name of the output layer. @type output_layer_name: str
method
setBGROutput(self)
Sets the flag indicating that output image is in BGR.
method
build(self, head_config: Dict [ str , Any ]) -> ImageOutputParser: ImageOutputParser
Configures the parser.  @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: ImageOutputParser
method
class

depthai_nodes.node.LaneDetectionParser(depthai_nodes.node.BaseParser)

method
__init__(self, output_layer_name: str = '', row_anchors: List [ int ] = None, griding_num: int = None, cls_num_per_lane: int = None, input_size: Tuple [ int , int ] = None)
Initializes the lane detection parser node.  @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str @param row_anchors: List of row anchors. @type row_anchors: List[int] @param griding_num: Griding number. @type griding_num: int @param cls_num_per_lane: Number of points per lane. @type cls_num_per_lane: int @param input_size: Input size (width,height). @type input_size: Tuple[int, int]
variable
variable
variable
variable
variable
method
setOutputLayerName(self, output_layer_name: str)
Set the output layer name for the lane detection model.  @param output_layer_name: Name of the output layer. @type output_layer_name: str
method
setRowAnchors(self, row_anchors: List [ int ])
Set the row anchors for the lane detection model.  @param row_anchors: List of row anchors. @type row_anchors: List[int]
method
setGridingNum(self, griding_num: int)
Set the griding number for the lane detection model.  @param griding_num: Griding number. @type griding_num: int
method
setClsNumPerLane(self, cls_num_per_lane: int)
Set the number of points per lane for the lane detection model.  @param cls_num_per_lane: Number of points per lane. @type cls_num_per_lane: int
method
setInputSize(self, input_size: Tuple [ int , int ])
Set the input size for the lane detection model.  @param input_size: Input size (width,height). @type input_size: Tuple[int, int]
method
build(self, head_config: Dict [ str , Any ]) -> LaneDetectionParser: LaneDetectionParser
Configures the parser.  @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: LaneDetectionParser
variable
variable
method
class

depthai_nodes.node.MapOutputParser(depthai_nodes.node.BaseParser)

method
__init__(self, output_layer_name: str = '', min_max_scaling: bool = False)
Initializes the parser node.  @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str @param min_max_scaling: If True, the map is scaled to the range [0, 1]. @type min_max_scaling: bool
variable
variable
method
setOutputLayerName(self, output_layer_name: str)
Sets the name of the output layer.  @param output_layer_name: The name of the output layer. @type output_layer_name: str
method
setMinMaxScaling(self, min_max_scaling: bool)
Sets the min_max_scaling flag.  @param min_max_scaling: If True, the map is scaled to the range [0, 1]. @type min_max_scaling: bool
method
build(self, head_config: Dict [ str , Any ]) -> MapOutputParser: MapOutputParser
Configures the parser.  @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: MapOutputParser
method
class

depthai_nodes.node.MPPalmDetectionParser(depthai_nodes.node.DetectionParser)

method
__init__(self, output_layer_names: List [ str ] = None, conf_threshold: float = 0.5, iou_threshold: float = 0.5, max_det: int = 100, scale: int = 192)
Initializes the parser node.  @param output_layer_names: Names of the output layers relevant to the parser. @type output_layer_names: List[str] @param conf_threshold: Confidence score threshold for detected hands. @type conf_threshold: float @param iou_threshold: Non-maximum suppression threshold. @type iou_threshold: float @param max_det: Maximum number of detections to keep. @type max_det: int @param scale: Scale of the input image. @type scale: int
variable
variable
variable
variable
variable
variable
method
setOutputLayerNames(self, output_layer_names: List [ str ])
Sets the output layer name(s) for the parser.  @param output_layer_names: The name of the output layer(s) from which the scores     are extracted. @type output_layer_names: List[str]
method
setScale(self, scale: int)
Sets the scale of the input image.  @param scale: Scale of the input image. @type scale: int
method
build(self, head_config: Dict [ str , Any ]) -> MPPalmDetectionParser: MPPalmDetectionParser
Configures the parser.  @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: MPPalmDetectionParser
method
class

depthai_nodes.node.MLSDParser(depthai_nodes.node.BaseParser)

method
__init__(self, output_layer_tpmap: str = '', output_layer_heat: str = '', topk_n: int = 200, score_thr: float = 0.1, dist_thr: float = 20.0)
Initializes the parser node.  @param output_layer_tpmap: Name of the output layer containing the tpMap tensor. @type output_layer_tpmap: str @param output_layer_heat: Name of the output layer containing the heat tensor. @type output_layer_heat: str @param topk_n: Number of top candidates to keep. @type topk_n: int @param score_thr: Confidence score threshold for detected lines. @type score_thr: float @param dist_thr: Distance threshold for merging lines. @type dist_thr: float
variable
variable
variable
variable
variable
method
setOutputLayerTPMap(self, output_layer_tpmap: str)
Sets the name of the output layer containing the tpMap tensor.  @param output_layer_tpmap: Name of the output layer containing the tpMap tensor. @type output_layer_tpmap: str
method
setOutputLayerHeat(self, output_layer_heat: str)
Sets the name of the output layer containing the heat tensor.  @param output_layer_heat: Name of the output layer containing the heat tensor. @type output_layer_heat: str
method
setTopK(self, topk_n: int)
Sets the number of top candidates to keep.  @param topk_n: Number of top candidates to keep. @type topk_n: int
method
setScoreThreshold(self, score_thr: float)
Sets the confidence score threshold for detected lines.  @param score_thr: Confidence score threshold for detected lines. @type score_thr: float
method
setDistanceThreshold(self, dist_thr: float)
Sets the distance threshold for merging lines.  @param dist_thr: Distance threshold for merging lines. @type dist_thr: float
method
build(self, head_config: Dict [ str , Any ]) -> MLSDParser: MLSDParser
Configures the parser.  @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: MLSDParser
method
class

depthai_nodes.node.PPTextDetectionParser(depthai_nodes.node.DetectionParser)

method
__init__(self, output_layer_name: str = '', conf_threshold: float = 0.5, mask_threshold: float = 0.25, max_det: int = 100)
Initializes the parser node.  @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str @param conf_threshold: The threshold for bounding boxes. @type conf_threshold: float @param mask_threshold: The threshold for the mask. @type mask_threshold: float @param max_det: The maximum number of candidate bounding boxes. @type max_det: int
variable
variable
method
setOutputLayerName(self, output_layer_name: str)
Sets the name of the output layer.  @param output_layer_name: The name of the output layer. @type output_layer_name: str
method
setMaskThreshold(self, mask_threshold: float = 0.25)
Sets the mask threshold for creating the mask from model output probabilities.  @param mask_threshold: The threshold for the mask. @type mask_threshold: float
method
build(self, head_config: Dict [ str , Any ]) -> PPTextDetectionParser: PPTextDetectionParser
Configures the parser.  @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: PPTextDetectionParser
method
class

depthai_nodes.node.RegressionParser(depthai_nodes.node.BaseParser)

method
__init__(self, output_layer_name: str = '')
Initializes the parser node.  @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str
variable
method
setOutputLayerName(self, output_layer_name: str)
Sets the name of the output layer.  @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str
method
build(self, head_config: Dict [ str , Any ]) -> RegressionParser: RegressionParser
Configures the parser.  @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: RegressionParser
method
class

depthai_nodes.node.SCRFDParser(depthai_nodes.node.DetectionParser)

method
__init__(self, output_layer_names: List [ str ] = None, conf_threshold: float = 0.5, iou_threshold: float = 0.5, max_det: int = 100, input_size: Tuple [ int , int ] = (640, 640), feat_stride_fpn: Tuple = (8, 16, 32), num_anchors: int = 2)
Initializes the parser node.  @param output_layer_names: Names of the output layers relevant to the parser. @type output_layer_names: List[str] @param conf_threshold: Confidence score threshold for detected faces. @type conf_threshold: float @param iou_threshold: Non-maximum suppression threshold. @type iou_threshold: float @param max_det: Maximum number of detections to keep. @type max_det: int @param input_size: Input size of the model. @type input_size: Tuple[int, int] @param feat_stride_fpn: List of the feature strides. @type feat_stride_fpn: Tuple @param num_anchors: Number of anchors. @type num_anchors: int
variable
variable
variable
variable
variable
method
setOutputLayerNames(self, output_layer_names: List [ str ])
Sets the output layer name(s) for the parser.  @param output_layer_names: The name of the output layer(s) to be used. @type output_layer_names: List[str]
method
setInputSize(self, input_size: Tuple [ int , int ])
Sets the input size of the model.  @param input_size: Input size of the model. @type input_size: Tuple[int, int]
method
setFeatStrideFPN(self, feat_stride_fpn: List [ int ])
Sets the feature stride of the FPN.  @param feat_stride_fpn: Feature stride of the FPN. @type feat_stride_fpn: List[int]
method
setNumAnchors(self, num_anchors: int)
Sets the number of anchors.  @param num_anchors: Number of anchors. @type num_anchors: int
method
build(self, head_config: Dict [ str , Any ]) -> SCRFDParser: SCRFDParser
Configures the parser.  @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: SCRFDParser
method
class

depthai_nodes.node.SegmentationParser(depthai_nodes.node.BaseParser)

method
__init__(self, output_layer_name: str = '', classes_in_one_layer: bool = False)
Initializes the parser node.  @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str @param classes_in_one_layer: Whether all classes are in one layer in the multi-     class segmentation model. Default is False. If True, the parser will use     np.max instead of np.argmax to get the class map. @type classes_in_one_layer: bool
variable
variable
method
setOutputLayerName(self, output_layer_name: str)
Sets the name of the output layer.  @param output_layer_name: The name of the output layer. @type output_layer_name: str
method
setClassesInOneLayer(self, classes_in_one_layer: bool)
Sets the flag indicating whether all classes are in one layer.  @param classes_in_one_layer: Whether all classes are in one layer. @type classes_in_one_layer: bool
method
build(self, head_config: Dict [ str , Any ]) -> SegmentationParser: SegmentationParser
Configures the parser.  @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: SegmentationParser
method
class

depthai_nodes.node.XFeatMonoParser(depthai_nodes.node.parsers.xfeat.XFeatBaseParser)

method
__init__(self, output_layer_feats: str = 'feats', output_layer_keypoints: str = 'keypoints', output_layer_heatmaps: str = 'heatmaps', original_size: Tuple [ float , float ] = None, input_size: Tuple [ float , float ] = (640, 352), max_keypoints: int = 4096)
Initializes the XFeatParser node.  @param output_layer_feats: Name of the output layer containing features. @type output_layer_feats: str @param output_layer_keypoints: Name of the output layer containing keypoints. @type output_layer_keypoints: str @param output_layer_heatmaps: Name of the output layer containing heatmaps. @type output_layer_heatmaps: str @param original_size: Original image size. @type original_size: Tuple[float, float] @param input_size: Input image size. @type input_size: Tuple[float, float] @param max_keypoints: Maximum number of keypoints to keep. @type max_keypoints: int
variable
variable
method
setTrigger(self)
Sets the trigger used to capture the reference frame.
method
class

depthai_nodes.node.XFeatStereoParser(depthai_nodes.node.parsers.xfeat.XFeatBaseParser)

method
__init__(self, output_layer_feats: str = 'feats', output_layer_keypoints: str = 'keypoints', output_layer_heatmaps: str = 'heatmaps', original_size: Tuple [ float , float ] = None, input_size: Tuple [ float , float ] = (640, 352), max_keypoints: int = 4096)
Initializes the XFeatParser node.  @param output_layer_feats: Name of the output layer containing features. @type output_layer_feats: str @param output_layer_keypoints: Name of the output layer containing keypoints. @type output_layer_keypoints: str @param output_layer_heatmaps: Name of the output layer containing heatmaps. @type output_layer_heatmaps: str @param original_size: Original image size. @type original_size: Tuple[float, float] @param input_size: Input image size. @type input_size: Tuple[float, float] @param max_keypoints: Maximum number of keypoints to keep. @type max_keypoints: int
method
class

depthai_nodes.node.YOLOExtendedParser(depthai_nodes.node.BaseParser)

method
__init__(self, conf_threshold: float = 0.5, n_classes: int = 1, label_names: Optional [ List [ str ] ] = None, iou_threshold: float = 0.5, mask_conf: float = 0.5, n_keypoints: int = 17, max_det: int = 300, anchors: Optional [ List [ List [ List [ float ] ] ] ] = None, subtype: str = '', keypoint_label_names: Optional [ List [ str ] ] = None, keypoint_edges: Optional [ List [ Tuple [ int , int ] ] ] = None)
Initialize the parser node.  @param conf_threshold: The confidence threshold for the detections @type conf_threshold: float @param n_classes: The number of classes in the model @type n_classes: int @param label_names: The names of the classes @type label_names: Optional[List[str]] @param iou_threshold: The intersection over union threshold @type iou_threshold: float @param mask_conf: The mask confidence threshold @type mask_conf: float @param n_keypoints: The number of keypoints in the model @type n_keypoints: int @param max_det: The maximum number of detections to keep @type max_det: int @param anchors: The anchors for the YOLO model @type anchors: Optional[List[List[List[float]]]] @param subtype: The version of the YOLO model @type subtype: str @param keypoint_label_names: The labels for the keypoints @type keypoint_label_names: Optional[List[str]] @param keypoint_edges: Connection pairs of the keypoints. Example: [(0,1), (1,2), (2,3), (3,0)] shows that keypoint 0 is connected to keypoint 1, keypoint 1 is connected to keypoint 2, etc. @type keypoint_edges: Optional[List[Tuple[int, int]]]
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
setOutputLayerNames(self, output_layer_names: List [ str ])
Sets the output layer names for the parser.  @param output_layer_names: The output layer names for the parser. @type output_layer_names: List[str]
method
setConfidenceThreshold(self, threshold: float)
Sets the confidence score threshold for detections.  @param threshold: Confidence score threshold for detections. @type threshold: float
method
setNumClasses(self, n_classes: int)
Sets the number of classes in the model.  @param n_classes: The number of classes in the model. @type n_classes: int
method
setIouThreshold(self, iou_threshold: float)
Sets the intersection over union threshold.  @param iou_threshold: The intersection over union threshold. @type iou_threshold: float
method
setMaskConfidence(self, mask_conf: float)
Sets the mask confidence threshold.  @param mask_conf: The mask confidence threshold. @type mask_conf: float
method
setNumKeypoints(self, n_keypoints: int)
Sets the number of keypoints in the model.  @param n_keypoints: The number of keypoints in the model. @type n_keypoints: int
method
setAnchors(self, anchors: List [ List [ List [ float ] ] ])
Sets the anchors for the YOLO model.  @param anchors: The anchors for the YOLO model. @type anchors: List[List[List[float]]]
method
setSubtype(self, subtype: str)
Sets the subtype of the YOLO model.  @param subtype: The subtype of the YOLO model. @type subtype: str
method
setLabelNames(self, label_names: List [ str ])
Sets the names of the classes.  @param label_names: The names of the classes. @type label_names: List[str]
method
setKeypointLabelNames(self, keypoint_label_names: List [ str ])
Sets the label names for the keypoints.  @param keypoint_label_names: The labels for the keypoints. @type keypoint_label_names: List[str]
method
setKeypointEdges(self, keypoint_edges: List [ Tuple [ int , int ] ])
Sets the edges for the keypoints.  @param keypoint_edges: The edges for the keypoints. @type keypoint_edges: List[Tuple[int, int]]
method
build(self, head_config: Dict [ str , Any ])
Configures the parser.  @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: YOLOExtendedParser
variable
method
class

depthai_nodes.node.YuNetParser(depthai_nodes.node.DetectionParser)

method
__init__(self, conf_threshold: float = 0.8, iou_threshold: float = 0.3, max_det: int = 5000, input_size: Tuple [ int , int ] = None, loc_output_layer_name: str = None, conf_output_layer_name: str = None, iou_output_layer_name: str = None)
Initializes the parser node.  @param conf_threshold: Confidence score threshold for detected faces. @type conf_threshold: float @param iou_threshold: Non-maximum suppression threshold. @type iou_threshold: float @param max_det: Maximum number of detections to keep. @type max_det: int @param input_size: Input size of the model (width, height). @type input_size: Tuple[int, int] @param loc_output_layer_name: Output layer name for the location predictions. @type loc_output_layer_name: str @param conf_output_layer_name: Output layer name for the confidence predictions. @type conf_output_layer_name: str @param iou_output_layer_name: Output layer name for the IoU predictions. @type iou_output_layer_name: str
variable
variable
variable
variable
variable
method
setInputSize(self, input_size: Tuple [ int , int ])
Sets the input size of the model.  @param input_size: Input size of the model (width, height). @type input_size: Tuple[int, int]
method
setOutputLayerLoc(self, loc_output_layer_name: str)
Sets the name of the output layer containing the location predictions.  @param loc_output_layer_name: Output layer name for the loc tensor. @type loc_output_layer_name: str
method
setOutputLayerConf(self, conf_output_layer_name: str)
Sets the name of the output layer containing the confidence predictions.  @param conf_output_layer_name: Output layer name for the conf tensor. @type conf_output_layer_name: str
method
setOutputLayerIou(self, iou_output_layer_name: str)
Sets the name of the output layer containing the IoU predictions.  @param iou_output_layer_name: Output layer name for the IoU tensor. @type iou_output_layer_name: str
method
build(self, head_config: Dict [ str , Any ]) -> YuNetParser: YuNetParser
Configures the parser.  @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: YuNetParser
variable
variable
method
class

depthai_nodes.node.ParsingNeuralNetwork(depthai.node.ThreadedHostNode)

method
__init__(self, *args, **kwargs)
Initialize the wrapper and create the internal neural-network node.
property
input
Return the primary neural-network input.
property
inputs
Return the neural-network input map.
property
out
Return the single parser output when the model has exactly one head.
property
outputs
Return the neural-network output carrying a dai.MessageGroup payload that contains the outputs of all model heads; individual outputs are accessed as a dictionary keyed by str(model head index).  Available only when the model has at least two heads; otherwise the out property must be used.
property
passthrough
Return the primary passthrough stream from the underlying NN node.
property
passthroughs
Return the passthrough output map from the underlying NN node.
method
getNumInferenceThreads(self) -> int: int
Return the configured number of inference threads.
method
getParser(self, *args, **kwargs) -> Union[BaseParser, dai.DeviceNode]: Union[BaseParser, dai.DeviceNode]
Return the parser for the requested model head.  @param parserType: Optional expected parser type used for runtime type checking. @type parserType: Type[TParser] @param index: Model head index. Defaults to 0. @type index: int @return: Parser node matching the requested head. @rtype: BaseParser | dai.DeviceNode
method
getOutput(self, head: int) -> dai.Node.Output: dai.Node.Output
Return the output stream for the specified model head.
method
setBackend(self, setBackend: str)
Set the backend used by the underlying neural-network node.
method
setBackendProperties(self, setBackendProperties: Dict [ str , str ])
Set backend-specific properties on the underlying neural-network node.
method
setBlob(self, blob: Union [ Path , dai.OpenVINO.Blob ])
Set the blob used by the underlying neural-network node.
method
setBlobPath(self, path: Path)
Set the blob path used by the underlying neural-network node.
method
setFromModelZoo(self, description: dai.NNModelDescription, useCached: bool)
Load the model from the model zoo into the underlying NN node.
method
setModelPath(self, modelPath: Path)
Set the model path used by the underlying neural-network node.
method
setNNArchive(self, nnArchive: dai.NNArchive, numShaves: Optional [ int ] = None)
Set the active model archive and rebuild parser nodes.  @param nnArchive: Neural-network archive containing the model and parser config. @type nnArchive: dai.NNArchive @param numShaves: Optional number of shaves allocated to the neural-network     node. @type numShaves: Optional[int]
method
setNumInferenceThreads(self, numThreads: int)
Sets the number of inference threads of the NeuralNetwork node.
method
setNumNCEPerInferenceThread(self, numNCEPerThread: int)
Sets the number of NCE per inference thread of the NeuralNetwork node.
method
setNumPoolFrames(self, numFrames: int)
Sets the number of pool frames of the NeuralNetwork node.
method
setNumShavesPerInferenceThread(self, numShavesPerInferenceThread: int)
Sets the number of shaves per inference thread of the NeuralNetwork node.
method
build(self, input: Union [ dai.Node.Output , dai.node.Camera ], nnSource: Union [ dai.NNModelDescription , dai.NNArchive , str ], fps: Optional [ float ] = None) -> ParsingNeuralNetwork: ParsingNeuralNetwork
Build the neural-network node and create parser nodes for each head.  @param input: Upstream output or camera feeding frames into the neural-network     node. @type input: Union[dai.Node.Output, dai.node.Camera] @param nnSource: dai.NNModelDescription, dai.NNArchive, or HubAI model slug. @type nnSource: Union[dai.NNModelDescription, dai.NNArchive, str] @param fps: Optional runtime FPS limit for the neural-network node. @type fps: Optional[float] @return: The configured node instance. @rtype: ParsingNeuralNetwork
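Example (sketch) — a minimal end-to-end pipeline; the HubAI model slug is hypothetical:

```python
import depthai as dai
from depthai_nodes.node import ParsingNeuralNetwork

with dai.Pipeline() as pipeline:
    cam = pipeline.create(dai.node.Camera).build()
    nn = pipeline.create(ParsingNeuralNetwork).build(
        cam, "luxonis/yolov6-nano:r2-coco-512x288"  # hypothetical model slug
    )
    detections_q = nn.out.createOutputQueue()
    pipeline.start()
    while pipeline.isRunning():
        detections = detections_q.get()  # parsed output of the single model head
```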
method
run(self)
Inherited from ``ThreadedHostNode``.  Runs when the pipeline starts.
method
cleanup(self)
Cleans up the ParsingNeuralNetwork node and removes all nodes created by ParsingNeuralNetwork from the pipeline.  Must be called before removing the node from the pipeline.
class

depthai_nodes.node.SnapsUploader(depthai_nodes.node.BaseHostNode)

method
method
setToken(self, token: str)
Set the Hub API token used for snap uploads.
method
setCacheDir(self, cacheDir: str)
Set the cache directory for storing cached data.  By default, the cache directory is set to /internal/private.
method
setCacheIfCannotSend(self, cacheIfCannotUpload: bool)
Set whether to cache data if it cannot be sent.  By default, cacheIfCannotUpload is set to False.
method
setLogResponse(self, logResponse: bool)
Set whether to log the responses from the server.  By default, logResponse is set to False. Logs are visible in DepthAI logs at the INFO level.
method
method
process(self, snap: dai.Buffer)
Upload the incoming ``SnapData`` payload to the Hub Events API.
class

depthai_nodes.node.Tiling(depthai_nodes.node.BaseThreadedHostNode)

method
property
out
Return the output stream of tile configuration groups.
property
tileCount
Return the number of tiles in the current configuration.
property
method
updateTilingConfig(self, overlap: Optional [ float ] = None, gridSize: Optional [ Tuple [ int , int ] ] = None, canvasShape: Optional [ Tuple [ int , int ] ] = None, resizeShape: Optional [ Tuple [ int , int ] ] = None, resizeMode: Optional [ dai.ImageManipConfig.ResizeMode ] = None, globalDetection: Optional [ bool ] = None, gridMatrix: Optional [ Union [ np.ndarray , List , None ] ] = None)
Update the tiling configuration used for future trigger messages.  @param overlap: Fractional overlap between adjacent tiles in the range [0, 1). @type overlap: Optional[float] @param gridSize: Tile grid as (columns, rows). @type gridSize: Optional[Tuple[int, int]] @param canvasShape: Shape of the image space the tiling is defined on. Crop     coordinates are computed in this absolute coordinate system. @type canvasShape: Optional[Tuple[int, int]] @param resizeShape: Output size applied to each tile after cropping. This is the     shape expected by downstream consumers, not necessarily a neural network. @type resizeShape: Optional[Tuple[int, int]] @param resizeMode: Resize strategy used when adapting each crop to resizeShape. @type resizeMode: Optional[dai.ImageManipConfig.ResizeMode] @param globalDetection: If True, prepend a config covering the whole canvas. @type globalDetection: Optional[bool] @param gridMatrix: Optional grouping matrix for merging neighboring grid cells     into larger crops. @type gridMatrix: Optional[Union[np.ndarray, List, None]]
method
build(self, overlap: float, gridSize: Tuple [ int , int ], canvasShape: Tuple [ int , int ], resizeShape: Tuple [ int , int ], resizeMode: dai.ImageManipConfig.ResizeMode, globalDetection: bool = False, gridMatrix: Union [ np.ndarray , List , None ] = None) -> Tiling: Tiling
Configure the tiling node and link the trigger stream.  @param overlap: Fractional overlap between adjacent tiles in the range [0, 1). @type overlap: float @param gridSize: Tile grid as (columns, rows). @type gridSize: Tuple[int, int] @param canvasShape: Shape of the image space the tiling is defined on. Crop     coordinates are computed in this absolute coordinate system. @type canvasShape: Tuple[int, int] @param resizeShape: Output size applied to each tile after cropping. This is the     shape expected by downstream consumers, not necessarily a neural network. @type resizeShape: Tuple[int, int] @param resizeMode: Resize strategy used when adapting each crop to resizeShape. @type resizeMode: dai.ImageManipConfig.ResizeMode @param globalDetection: If True, prepend a config covering the whole canvas. @type globalDetection: bool @param gridMatrix: Optional grouping matrix for merging neighboring grid cells     into larger crops. @type gridMatrix: Union[np.ndarray, List, None] @return: The configured node instance. @rtype: Tiling
method
run(self)
Send the initial tiling configuration to the script node when updated.
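Example (sketch) — a configuration using the documented build parameters; ``pipeline`` is a placeholder dai.Pipeline:

```python
import depthai as dai
from depthai_nodes.node import Tiling

tiling = pipeline.create(Tiling).build(
    overlap=0.1,               # 10% overlap between adjacent tiles
    gridSize=(2, 2),           # columns x rows
    canvasShape=(1920, 1080),  # coordinate space the crops are defined in
    resizeShape=(512, 288),    # output size applied to each tile
    resizeMode=dai.ImageManipConfig.ResizeMode.LETTERBOX,
    globalDetection=True,      # prepend a config covering the whole canvas
)
tile_configs_q = tiling.out.createOutputQueue()  # groups of tile configurations
```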
package

depthai_nodes.utils

module
module
module
module
class
AnnotationHelper
Simplifies `dai.ImgAnnotation` creation.  After calling the desired drawing methods, call the `build` method to create the `ImgAnnotations` message.
class
class
class
ViewportClipper
Clips point coordinates so that they do not fall outside a specified viewport.
module

depthai_nodes.utils.annotation_helper

type alias
type alias
class

depthai_nodes.utils.AnnotationHelper

method
variable
method
draw_line(self, pt1: Union [ Point , dai.Point2f ], pt2: Union [ Point , dai.Point2f ], color: Union [ ColorRGBA , dai.Color ] = PRIMARY_COLOR, thickness: float = 2.0, clip_to_viewport: bool = False) -> AnnotationHelper: AnnotationHelper
Draws a line between two points.  @param pt1: Start of the line @type pt1: Point | dai.Point2f @param pt2: End of the line @type pt2: Point | dai.Point2f @param color: Line color @type color: ColorRGBA | dai.Color @param thickness: Line thickness @type thickness: float @param clip_to_viewport: Indication whether to clip the line to the viewport @type clip_to_viewport: bool @return: self @rtype: AnnotationHelper
method
draw_polyline(self, points: Union [ List [ Point ] , List [ dai.Point2f ] ], outline_color: Union [ ColorRGBA , dai.Color ] = PRIMARY_COLOR, fill_color: Union [ Optional [ ColorRGBA ] , Optional [ dai.Color ] ] = TRANSPARENT_PRIMARY_COLOR, thickness: float = 1.0, closed: bool = False) -> AnnotationHelper: AnnotationHelper
Draws a polyline.  @param points: List of points of the polyline @type points: List[Point] | List[dai.Point2f] @param outline_color: Outline color @type outline_color: ColorRGBA | dai.Color @param fill_color: Fill color (None for no fill) @type fill_color: ColorRGBA | dai.Color | None @param thickness: Line thickness @type thickness: float @param closed: Creates a polygon instead of a polyline if True @type closed: bool @return: self @rtype: AnnotationHelper
method
draw_points(self, points: Union [ List [ Point ] , List [ dai.Point2f ] , dai.VectorPoint2f ], color: Union [ ColorRGBA , dai.Color ] = PRIMARY_COLOR, thickness: float = 2.0) -> AnnotationHelper: AnnotationHelper
Draws points.  @param points: List of points to draw @type points: List[Point] | List[dai.Point2f] @param color: Color of the points @type color: ColorRGBA | dai.Color @param thickness: Size of the points @type thickness: float @return: self @rtype: AnnotationHelper
method
draw_circle(self, center: Union [ Point , dai.Point2f ], radius: float, outline_color: Union [ ColorRGBA , dai.Color ] = PRIMARY_COLOR, fill_color: Union [ Optional [ ColorRGBA ] , Optional [ dai.Color ] ] = None, thickness: float = 1.0) -> AnnotationHelper: AnnotationHelper
Draws a circle.  @param center: Center of the circle @type center: Point | dai.Point2f @param radius: Radius of the circle @type radius: float @param outline_color: Outline color @type outline_color: ColorRGBA | dai.Color @param fill_color: Fill color (None for no fill) @type fill_color: ColorRGBA | dai.Color | None @param thickness: Outline thickness @type thickness: float @return: self @rtype: AnnotationHelper
method
draw_rectangle(self, top_left: Union [ Point , dai.Point2f ], bottom_right: Union [ Point , dai.Point2f ], outline_color: Union [ ColorRGBA , dai.Color ] = PRIMARY_COLOR, fill_color: Union [ Optional [ ColorRGBA ] , Optional [ dai.Color ] ] = TRANSPARENT_PRIMARY_COLOR, thickness: float = 1.0, clip_to_viewport: bool = False) -> AnnotationHelper: AnnotationHelper
Draws a rectangle.  @param top_left: Top left corner of the rectangle @type top_left: Point | dai.Point2f @param bottom_right: Bottom right corner of the rectangle @type bottom_right: Point | dai.Point2f @param outline_color: Outline color @type outline_color: ColorRGBA | dai.Color @param fill_color: Fill color (None for no fill) @type fill_color: ColorRGBA | dai.Color | None @param thickness: Outline thickness @type thickness: float @param clip_to_viewport: Indication whether to clip the rectangle to the viewport @type clip_to_viewport: bool @return: self @rtype: AnnotationHelper
method
draw_text(self, text: str, position: Union [ Point , dai.Point2f ], color: Union [ ColorRGBA , dai.Color ] = PRIMARY_COLOR, background_color: Union [ Optional [ ColorRGBA ] , Optional [ dai.Color ] ] = None, size: float = 32) -> AnnotationHelper: AnnotationHelper
Draws text.  @param text: Text string @type text: str @param position: Text position @type position: Point | dai.Point2f @param color: Text color @type color: ColorRGBA | dai.Color @param background_color: Background color (None for no background) @type background_color: ColorRGBA | dai.Color | None @param size: Text size @type size: float @return: self @rtype: AnnotationHelper
method
draw_rotated_rect(self, center: Union [ Point , dai.Point2f ], size: Union [ Tuple [ float , float ] , dai.Size2f ], angle: float, outline_color: Union [ ColorRGBA , dai.Color ] = PRIMARY_COLOR, fill_color: Union [ Optional [ ColorRGBA ] , Optional [ dai.Color ] ] = TRANSPARENT_PRIMARY_COLOR, thickness: float = 1.0, clip_to_viewport: bool = False) -> AnnotationHelper: AnnotationHelper
Draws a rotated rectangle.  @param center: Center of the rectangle @type center: Point | dai.Point2f @param size: Size of the rectangle (width, height) @type size: Tuple[float, float] | dai.Size2f @param angle: Angle of rotation in degrees @type angle: float @param outline_color: Outline color @type outline_color: ColorRGBA | dai.Color @param fill_color: Fill color (None for no fill) @type fill_color: ColorRGBA | dai.Color | None @param thickness: Outline thickness @type thickness: float @param clip_to_viewport: Indication whether to clip the rectangle to the viewport @type clip_to_viewport: bool @return: self @rtype: AnnotationHelper
method
build(self, timestamp: timedelta, sequence_num: int) -> dai.ImgAnnotations: dai.ImgAnnotations
Creates an ImgAnnotations message.  @param timestamp: Message timestamp @type timestamp: timedelta @param sequence_num: Message sequence number @type sequence_num: int @return: Created ImgAnnotations message @rtype: dai.ImgAnnotations
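Example (sketch) — plain ``(x, y)`` tuples are assumed to satisfy the ``Point`` alias, a no-argument constructor is assumed, and coordinates are normalized to [0, 1]:

```python
from datetime import timedelta

from depthai_nodes.utils import AnnotationHelper

annotations = (
    AnnotationHelper()
    .draw_rectangle((0.1, 0.1), (0.5, 0.5))         # top-left, bottom-right
    .draw_text("person 0.92", (0.1, 0.08), size=24)
    .draw_circle((0.3, 0.3), radius=0.02)
    .build(timestamp=timedelta(milliseconds=0), sequence_num=0)
)
# `annotations` is a dai.ImgAnnotations message ready to send downstream
```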
class

depthai_nodes.utils.AnnotationSizes

class

depthai_nodes.utils.SmallerAnnotationSizes(depthai_nodes.utils.AnnotationSizes)

class

depthai_nodes.utils.ViewportClipper

method
method
clip_rect(self, points: List [ Tuple [ float , float ] ]) -> List[Tuple[float, float]]: List[Tuple[float, float]]
Clips a rectangle defined by points to the viewport.  Uses the Sutherland-Hodgman polygon clipping algorithm.  @param points: List of points defining the polygon vertices @type points: List[Tuple[float, float]] @return: List of points defining the clipped polygon vertices @rtype: List[Tuple[float, float]]
method
clip_line(self, pt1: Tuple [ float , float ], pt2: Tuple [ float , float ])
Clips a line segment to the viewport [0, 1].  Uses the Cohen-Sutherland line clipping algorithm.  @param pt1: Start point of the line @type pt1: tuple[float, float] @param pt2: End point of the line @type pt2: tuple[float, float] @return: Clipped line segment as (start_point, end_point) or None if the line is completely outside of the viewport @rtype: tuple[tuple[float, float], tuple[float, float]] | None
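Example (sketch) — a no-argument constructor defaulting to the unit viewport is an assumption:

```python
from depthai_nodes.utils import ViewportClipper

clipper = ViewportClipper()  # constructor arguments may differ
clipped = clipper.clip_line((-0.2, 0.5), (0.7, 0.5))
# expected: ((0.0, 0.5), (0.7, 0.5)); None if the segment misses the viewport
```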