DepthAI Nodes
Overview
An open-source library of Python host nodes that can be utilized in DepthAI pipelines. The source code is available on GitHub.
Installation
The package is hosted on PyPI, so it can be installed with pip:
pip install depthai-nodes
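After installation, a quick import can confirm the package is available (a minimal sketch; the __version__ attribute is assumed to be exposed, as is conventional for PyPI packages):

```python
import depthai_nodes

# Sanity check: print the installed version (assumes __version__ is exposed).
print(depthai_nodes.__version__)
```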
API Reference
0.2.1
package
depthai_nodes
module
depthai_nodes.constants
module
depthai_nodes.logging
function
setup_logging(level: Optional[str] = None, file: Optional[str] = None)
Globally configures logging for depthai_nodes package. @type level: str or None @param level: Logging level. One of "CRITICAL", "DEBUG", "ERR", "INFO", and "WARN". Can be changed using "DEPTHAI_NODES_LEVEL" env variable. If not set defaults to "DEPTHAI_LEVEL" if set or "WARN". @type file: str or None @param file: Path to a file where logs will be saved. If None, logs will not be saved. Defaults to None.
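A minimal usage sketch based on the signature above; the import path follows the module listing:

```python
from depthai_nodes.logging import setup_logging

# Log depthai_nodes messages at DEBUG level and also write them to a file.
setup_logging(level="DEBUG", file="depthai_nodes.log")
```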
class
depthai_nodes.logging.LogLevel(enum.Enum)
package
depthai_nodes.message
class
Classifications
Classification class for storing the classes and their respective scores. Attributes ---------- classes : list[str] A list of classes. scores : NDArray[np.float32] Corresponding probability scores. transformation : dai.ImgTransformation Image transformation object.
class
Cluster
Cluster class for storing a cluster. Attributes ---------- label : int Label of the cluster. points : List[dai.Point2f] List of points in the cluster.
class
Clusters
Clusters class for storing clusters. Attributes ---------- clusters : List[Cluster] List of clusters. transformation : dai.ImgTransformation Image transformation object.
class
ImgDetectionExtended
A class for storing image detections in (x_center, y_center, width, height) format with additional angle, label and keypoints. Attributes ---------- rotated_rect: dai.RotatedRect Rotated rectangle object defined by the center, width, height and angle in degrees. confidence: float Confidence of the detection. label: int Label of the detection. label_name: str The corresponding label name if available. keypoints: List[Keypoint] Keypoints of the detection.
class
ImgDetectionsExtended
ImgDetectionsExtended class for storing image detections with keypoints. Attributes ---------- detections: List[ImgDetectionExtended] Image detections with keypoints. masks: np.ndarray The segmentation masks of the image. All masks are stored in a single numpy array. transformation : dai.ImgTransformation Image transformation object.
class
Keypoint
Keypoint class for storing a keypoint. Attributes ---------- x: float X coordinate of the keypoint, relative to the input height. y: float Y coordinate of the keypoint, relative to the input width. z: Optional[float] Z coordinate of the keypoint. confidence: Optional[float] Confidence of the keypoint.
class
Keypoints
Keypoints class for storing keypoints. Attributes ---------- keypoints: List[Keypoint] List of Keypoint objects, each representing a keypoint. transformation : dai.ImgTransformation Image transformation object.
class
Line
Line class for storing a line. Attributes ---------- start_point : dai.Point2f Start point of the line with x and y coordinate. end_point : dai.Point2f End point of the line with x and y coordinate. confidence : float Confidence of the line.
class
Lines
Lines class for storing lines. Attributes ---------- lines : List[Line] List of detected lines. transformation : dai.ImgTransformation Image transformation object.
class
Map2D
Map2D class for storing a 2D map of floats. Attributes ---------- map : NDArray[np.float32] 2D map. width : int 2D Map width. height : int 2D Map height. transformation : dai.ImgTransformation Image transformation object.
class
Prediction
Prediction class for storing a prediction. Attributes ---------- prediction : float The predicted value.
class
Predictions
Predictions class for storing predictions. Attributes ---------- predictions : List[Prediction] List of predictions. transformation : dai.ImgTransformation Image transformation object.
class
SegmentationMask
SegmentationMask class for a single- or multi-object segmentation mask. Unassigned pixels are represented with "-1" and foreground classes with non-negative integers. Attributes ---------- mask: NDArray[np.int16] Segmentation mask. transformation : dai.ImgTransformation Image transformation object.
package
depthai_nodes.message.creators
function
create_classification_message(classes: List[str], scores: Union[np.ndarray, List]) -> Classifications: Classifications
Create a message for classification. The message contains the class names and their respective scores, sorted in descending order of scores. @param classes: A list containing class names. @type classes: List[str] @param scores: A numpy array of shape (n_classes,) containing the probability score of each class. @type scores: np.ndarray @return: A message with attributes `classes` and `scores`. `classes` is a list of classes, sorted in descending order of scores. `scores` is a list of the corresponding scores. @rtype: Classifications @raises ValueError: If the provided classes are None. @raises ValueError: If the provided classes are not a list. @raises ValueError: If the provided classes are empty. @raises ValueError: If the provided scores are None. @raises ValueError: If the provided scores are not a list or a numpy array. @raises ValueError: If the provided scores are empty. @raises ValueError: If the provided scores are not a 1D array. @raises ValueError: If the provided scores are not of type float. @raises ValueError: If the provided scores do not sum to 1. @raises ValueError: If the number of labels and scores mismatch.
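A short sketch of how this creator might be called in a host node, assuming the function is importable from depthai_nodes.message.creators as the package listing suggests:

```python
import numpy as np
from depthai_nodes.message.creators import create_classification_message

classes = ["cat", "dog", "bird"]
scores = np.array([0.7, 0.2, 0.1], dtype=np.float32)  # probabilities, must sum to 1

msg = create_classification_message(classes, scores)
# Classes are sorted by descending score, so msg.classes[0] is the most probable class.
```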
function
create_classification_sequence_message(classes: List[str], scores: Union[np.ndarray, List], ignored_indexes: Optional[List[int]] = None, remove_duplicates: bool = False, concatenate_classes: bool = False) -> Classifications: Classifications
Creates a message for a multi-class sequence. The message contains the class names and their respective scores, ordered according to the sequence. The 'scores' array is a sequence of probabilities for each class at each position in the sequence. @param classes: A list of class names, with length 'n_classes'. @type classes: List @param scores: A numpy array of shape (sequence_length, n_classes) containing the (row-wise) probability distributions over the classes. @type scores: np.ndarray @param ignored_indexes: A list of indexes to ignore during classification generation (e.g., background class, padding class). Defaults to None. @type ignored_indexes: Optional[List[int]] @param remove_duplicates: If True, removes consecutive duplicates from the sequence. Defaults to False. @type remove_duplicates: bool @param concatenate_classes: If True, concatenates consecutive classes based on the space character. Defaults to False. @type concatenate_classes: bool @return: A Classification message with attributes `classes` and `scores`, where `classes` is a list of class names and `scores` is a list of corresponding scores. @rtype: Classifications @raises ValueError: If 'classes' is not a list of strings. @raises ValueError: If 'scores' is not a 2D array or list of shape (sequence_length, n_classes). @raises ValueError: If the number of classes does not match the number of columns in 'scores'. @raises ValueError: If any score is not in the range [0, 1]. @raises ValueError: If the probabilities in any row of 'scores' do not sum to 1. @raises ValueError: If 'ignored_indexes' is not None and not a list of valid indexes within the range [0, n_classes - 1].
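A hedged sketch for a text-recognition-style sequence, assuming the same import path as the other creators; the class list and scores are illustrative only:

```python
import numpy as np
from depthai_nodes.message.creators import create_classification_sequence_message

classes = ["-", "a", "b", "c"]      # index 0 acts as a blank/padding class
scores = np.array([
    [0.10, 0.80, 0.05, 0.05],       # step 0 -> "a"
    [0.70, 0.10, 0.10, 0.10],       # step 1 -> blank, ignored below
    [0.05, 0.05, 0.10, 0.80],       # step 2 -> "c"
], dtype=np.float32)                # each row sums to 1

msg = create_classification_sequence_message(
    classes, scores, ignored_indexes=[0], remove_duplicates=True
)
```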
function
create_cluster_message(clusters: List[List[List[Union[float, int]]]]) -> Clusters: Clusters
Create a DepthAI message for clusters. @param clusters: List of clusters. Each cluster is a list of points with x and y coordinates. @type clusters: List[List[List[Union[float, int]]]] @return: Clusters message containing the detected clusters. @rtype: Clusters @raise TypeError: If the clusters are not a list. @raise TypeError: If each cluster is not a list. @raise TypeError: If each point is not a list. @raise TypeError: If each value in the point is not an int or float.
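A minimal sketch with two illustrative clusters (import path assumed as above):

```python
from depthai_nodes.message.creators import create_cluster_message

# Two clusters, each given as a list of [x, y] points.
clusters = [
    [[0.10, 0.20], [0.12, 0.22], [0.11, 0.25]],
    [[0.70, 0.80], [0.72, 0.78]],
]
msg = create_cluster_message(clusters)
```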
function
create_detection_message(bboxes: np.ndarray, scores: np.ndarray, angles: np.ndarray = None, labels: np.ndarray = None, label_names: Optional[List[str]] = None, keypoints: np.ndarray = None, keypoints_scores: np.ndarray = None, masks: np.ndarray = None) -> ImgDetectionsExtended: ImgDetectionsExtended
Create a DepthAI message for object detection. The message contains the bounding boxes in X_center, Y_center, Width, Height format with optional angles, labels and detected object keypoints and masks. @param bbox: Bounding boxes of detected objects in (x_center, y_center, width, height) format. @type bbox: np.ndarray @param scores: Confidence scores of the detected objects of shape (N,). @type scores: np.ndarray @param angles: Angles of detected objects expressed in degrees. Defaults to None. @type angles: Optional[np.ndarray] @param labels: Labels of detected objects of shape (N,). Defaults to None. @type labels: Optional[np.ndarray] @param label_names: Names of the labels (classes) @type label_names: Optional[List[str]] @param keypoints: Keypoints of detected objects of shape (N, n_keypoints, dim) where dim is 2 or 3. Defaults to None. @type keypoints: Optional[np.array] @param keypoints_scores: Confidence scores of detected keypoints of shape (N, n_keypoints, 1). Defaults to None. @type keypoints_scores: Optional[np.ndarray] @param masks: Masks of detected objects of shape (H, W). Defaults to None. @type masks: Optional[np.ndarray] @return: Message containing the bounding boxes, labels, confidence scores, and keypoints of detected objects. @rtype: ImgDetectionsExtended @raise ValueError: If the bboxes are not a numpy array. @raise ValueError: If the bboxes are not of shape (N,4). @raise ValueError: If the scores are not a numpy array. @raise ValueError: If the scores are not of shape (N,). @raise ValueError: If the scores do not have the same length as bboxes. @raise ValueError: If the angles do not have the same length as bboxes. @raise ValueError: If the angles are not between -360 and 360. @raise ValueError: If the labels are not a list of integers. @raise ValueError: If the labels do not have the same length as bboxes. @raise ValueError: If the keypoints are not a numpy array of shape (N, M, 2 or 3). @raise ValueError: If the masks are not a 3D numpy array of shape (img_height, img_width, N) or (N, img_height, img_width). @raise ValueError: If the keypoints scores are not a numpy array. @raise ValueError: If the keypoints scores are not of shape [n_detections, n_keypoints, 1]. @raise ValueError: If the keypoints scores do not have the same length as keypoints. @raise ValueError: If the keypoints scores are not between 0 and 1.
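A minimal sketch with two illustrative detections; coordinates are assumed to be normalized to [0, 1], and the import path follows the package listing:

```python
import numpy as np
from depthai_nodes.message.creators import create_detection_message

# Two detections in (x_center, y_center, width, height) format.
bboxes = np.array([[0.50, 0.50, 0.20, 0.30],
                   [0.20, 0.30, 0.10, 0.10]], dtype=np.float32)
scores = np.array([0.90, 0.75], dtype=np.float32)
labels = np.array([0, 1])

msg = create_detection_message(bboxes=bboxes, scores=scores, labels=labels,
                               label_names=["person", "car"])
```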
function
create_image_message(image: np.ndarray, is_bgr: bool = True, img_frame_type: dai.ImgFrame.Type = dai.ImgFrame.Type.BGR888i) -> dai.ImgFrame: dai.ImgFrame
Create a DepthAI message for an image array. @param image: Image array in HWC or CHW format. @type image: np.array @param is_bgr: If True, the image is in BGR format. If False, the image is in RGB format. Defaults to True. @type is_bgr: bool @param img_frame_type: Output ImgFrame type. Defaults to BGR888i. @type img_frame_type: dai.ImgFrame.Type @return: dai.ImgFrame object containing the image information. @rtype: dai.ImgFrame @raise ValueError: If the image shape is not CHW or HWC.
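A minimal sketch wrapping a NumPy image (import path assumed as above):

```python
import numpy as np
from depthai_nodes.message.creators import create_image_message

image = np.zeros((480, 640, 3), dtype=np.uint8)  # dummy BGR image in HWC layout
frame = create_image_message(image, is_bgr=True)
```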
function
create_keypoints_message(keypoints: Union[np.ndarray, List[List[float]]], scores: Union[np.ndarray, List[float], None] = None, confidence_threshold: Optional[float] = None) -> Keypoints: Keypoints
Create a DepthAI message for the keypoints. @param keypoints: Detected 2D or 3D keypoints of shape (N,2 or 3) meaning [...,[x, y],...] or [...,[x, y, z],...]. @type keypoints: np.ndarray or List[List[float]] @param scores: Confidence scores of the detected keypoints. Defaults to None. @type scores: Union[np.ndarray, List[float], None] @param confidence_threshold: Confidence threshold of keypoint detections. Defaults to None. @type confidence_threshold: Optional[float] @return: Keypoints message containing the detected keypoints. @rtype: Keypoints @raise ValueError: If the keypoints are not a numpy array or list. @raise ValueError: If the scores are not a numpy array or list. @raise ValueError: If scores and keypoints do not have the same length. @raise ValueError: If score values are not floats. @raise ValueError: If score values are not between 0 and 1. @raise ValueError: If the confidence threshold is not a float. @raise ValueError: If the confidence threshold is not between 0 and 1. @raise ValueError: If the keypoints are not of shape (N,2 or 3). @raise ValueError: If the keypoints 2nd dimension is not of size 2 or 3.
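A minimal sketch with three illustrative 2D keypoints (import path assumed as above):

```python
import numpy as np
from depthai_nodes.message.creators import create_keypoints_message

keypoints = np.array([[0.10, 0.20], [0.40, 0.50], [0.80, 0.90]], dtype=np.float32)
scores = np.array([0.90, 0.80, 0.40], dtype=np.float32)

# Keypoints scoring below the threshold are filtered out of the message.
msg = create_keypoints_message(keypoints, scores, confidence_threshold=0.5)
```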
function
create_line_detection_message(lines: np.ndarray, scores: np.ndarray)
Create a DepthAI message for a line detection. @param lines: Detected lines of shape (N,4) meaning [...,[x_start, y_start, x_end, y_end],...]. @type lines: np.ndarray @param scores: Confidence scores of detected lines of shape (N,). @type scores: np.ndarray @return: Message containing the lines and confidence scores of detected lines. @rtype: Lines @raise ValueError: If the lines are not a numpy array. @raise ValueError: If the lines are not of shape (N,4). @raise ValueError: If the lines 2nd dimension is not of size 4. @raise ValueError: If the scores are not a numpy array. @raise ValueError: If the scores are not of shape (N,). @raise ValueError: If the scores do not have the same length as lines.
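A minimal sketch with two illustrative lines (import path assumed as above):

```python
import numpy as np
from depthai_nodes.message.creators import create_line_detection_message

lines = np.array([[0.1, 0.1, 0.9, 0.1],   # [x_start, y_start, x_end, y_end]
                  [0.2, 0.5, 0.2, 0.9]], dtype=np.float32)
scores = np.array([0.95, 0.80], dtype=np.float32)
msg = create_line_detection_message(lines, scores)
```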
function
create_map_message(map: np.ndarray, min_max_scaling: bool = False) -> Map2D: Map2D
Create a DepthAI message for a map of floats. @param map: A NumPy array representing the map with shape HW or NHW/HWN. Here N stands for batch dimension. @type map: np.array @param min_max_scaling: If True, the map is scaled to the range [0, 1]. Defaults to False. @type min_max_scaling: bool @return: A Map2D object containing the density information. @rtype: Map2D @raise ValueError: If the density map is not a NumPy array. @raise ValueError: If the density map is not 2D or 3D. @raise ValueError: If the 3D density map shape is not NHW or HWN.
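A minimal sketch using a random HW map as a stand-in for a model output (import path assumed as above):

```python
import numpy as np
from depthai_nodes.message.creators import create_map_message

depth_like = np.random.rand(240, 320).astype(np.float32)  # an HW map of floats
msg = create_map_message(depth_like, min_max_scaling=True)
```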
function
create_regression_message(predictions: List[float]) -> Predictions: Predictions
Create a DepthAI message for prediction models. @param predictions: Predicted value(s). @type predictions: List[float] @return: Predictions message containing the predicted value(s). @rtype: Predictions @raise ValueError: If predictions is not a list. @raise ValueError: If each prediction is not a float.
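A minimal sketch with a single illustrative value (import path assumed as above):

```python
from depthai_nodes.message.creators import create_regression_message

msg = create_regression_message([23.5])  # e.g. a single predicted age value
```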
function
create_segmentation_message(mask: np.ndarray) -> SegmentationMask: SegmentationMask
Create a DepthAI message for segmentation mask. @param mask: Segmentation map array of shape (H, W) where each value represents a segmented object class. @type mask: np.array @return: Segmentation mask message. @rtype: SegmentationMask @raise ValueError: If mask is not a numpy array. @raise ValueError: If mask is not 2D. @raise ValueError: If mask is not of type int16.
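A minimal sketch building a mask by hand (import path assumed as above):

```python
import numpy as np
from depthai_nodes.message.creators import create_segmentation_message

# -1 marks unassigned pixels; non-negative integers mark object classes.
mask = np.full((240, 320), -1, dtype=np.int16)
mask[60:120, 80:160] = 2
msg = create_segmentation_message(mask)
```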
function
create_tracked_features_message(reference_points: np.ndarray, target_points: np.ndarray) -> dai.TrackedFeatures: dai.TrackedFeatures
Create a DepthAI message for tracked features. @param reference_points: Reference points of shape (N,2) meaning [...,[x, y],...]. @type reference_points: np.ndarray @param target_points: Target points of shape (N,2) meaning [...,[x, y],...]. @type target_points: np.ndarray @return: Message containing the tracked features. @rtype: dai.TrackedFeatures @raise ValueError: If the reference_points are not a numpy array. @raise ValueError: If the reference_points are not of shape (N,2). @raise ValueError: If the reference_points 2nd dimension is not of size 2. @raise ValueError: If the target_points are not a numpy array. @raise ValueError: If the target_points are not of shape (N,2). @raise ValueError: If the target_points 2nd dimension is not of size 2.
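A minimal sketch with two illustrative point pairs (import path assumed as above):

```python
import numpy as np
from depthai_nodes.message.creators import create_tracked_features_message

reference_points = np.array([[0.10, 0.10], [0.50, 0.50]], dtype=np.float32)
target_points = np.array([[0.12, 0.11], [0.52, 0.48]], dtype=np.float32)
msg = create_tracked_features_message(reference_points, target_points)
```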
module
depthai_nodes.message.creators.tracked_features
function
create_feature_point(x: float, y: float, id: int, age: int) -> dai.TrackedFeature: dai.TrackedFeature
Create a tracked feature point. @param x: X coordinate of the feature point. @type x: float @param y: Y coordinate of the feature point. @type y: float @param id: ID of the feature point. @type id: int @param age: Age of the feature point. @type age: int @return: Tracked feature point. @rtype: dai.TrackedFeature
class
depthai_nodes.message.Classifications(depthai.Buffer)
method
__init__(self)
Initializes the Classifications object.
property
classes
Returns the list of classes. @return: List of classes. @rtype: List[str]
method
classes.setter(self, value: List[str])
Sets the classes. @param value: A list of class names. @type value: List[str] @raise TypeError: If value is not a list. @raise ValueError: If each element is not of type string.
property
scores
Returns the list of scores. @return: List of scores. @rtype: NDArray[np.float32]
method
scores.setter(self, value: NDArray[np.float32])
Sets the scores. @param value: A list of scores. @type value: NDArray[np.float32] @raise TypeError: If value is not a numpy array. @raise ValueError: If value is not a 1D numpy array. @raise ValueError: If each element is not of type float.
property
top_class
Returns the most probable class. Only works if classes are sorted by scores. @return: The top class. @rtype: str
property
top_score
Returns the probability of the most probable class. Only works if scores are sorted by descending order. @return: The top score. @rtype: float
variable
transformation
Returns the Image Transformation object. @return: The Image Transformation object. @rtype: dai.ImgTransformation
method
transformation.setter(self, value: dai.ImgTransformation)
Sets the Image Transformation object. @param value: The Image Transformation object. @type value: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
setTransformation(self, transformation: dai.ImgTransformation)
Sets the Image Transformation object. @param transformation: The Image Transformation object. @type transformation: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
getVisualizationMessage(self) -> dai.ImgAnnotations: dai.ImgAnnotations
Returns default visualization message for classification. The message adds the top five classes and their scores to the right side of the image.
class
depthai_nodes.message.Cluster(depthai.Buffer)
method
__init__(self)
Initializes the Cluster object.
property
label
Returns the label of the cluster. @return: Label of the cluster. @rtype: int
method
label.setter(self, value: int)
Sets the label of the cluster. @param value: Label of the cluster. @type value: int @raise TypeError: If value is not an int.
property
points
Returns the points in the cluster. @return: List of points in the cluster. @rtype: List[dai.Point2f]
method
points.setter(self, value: List[dai.Point2f])
Sets the points in the cluster. @param value: List of points in the cluster. @type value: List[dai.Point2f] @raise TypeError: If value is not a list. @raise TypeError: If each element is not of type dai.Point2f.
class
depthai_nodes.message.Clusters(depthai.Buffer)
method
__init__(self)
Initializes the Clusters object.
property
clusters
Returns the clusters. @return: List of clusters. @rtype: List[Cluster]
method
clusters.setter(self, value: List[Cluster])
Sets the clusters. @param value: List of clusters. @type value: List[Cluster] @raise TypeError: If value is not a list. @raise ValueError: If each element is not of type Cluster.
variable
transformation
Returns the Image Transformation object. @return: The Image Transformation object. @rtype: dai.ImgTransformation
method
transformation.setter(self, value: dai.ImgTransformation)
Sets the Image Transformation object. @param value: The Image Transformation object. @type value: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
setTransformation(self, transformation: dai.ImgTransformation)
Sets the Image Transformation object. @param transformation: The Image Transformation object. @type transformation: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
getVisualizationMessage(self) -> dai.ImgAnnotations: dai.ImgAnnotations
Creates a default visualization message for clusters and colors each one separately.
class
depthai_nodes.message.ImgDetectionExtended(depthai.Buffer)
method
__init__(self)
Initializes the ImgDetectionExtended object.
method
copy(self)
Creates a new instance of the ImgDetectionExtended class and copies the attributes. @return: A new instance of the ImgDetectionExtended class. @rtype: ImgDetectionExtended
property
rotated_rect
Returns the rotated rectangle representing the bounding box. @return: Rotated rectangle object @rtype: dai.RotatedRect
method
rotated_rect.setter(self, rectangle: Tuple[float, float, float, float, float])
Sets the rotated rectangle of the bounding box. @param value: Tuple of (x_center, y_center, width, height, angle). @type value: tuple[float, float, float, float, float]
property
confidence
Returns the confidence of the detection. @return: Confidence of the detection. @rtype: float
method
confidence.setter(self, value: float)
Sets the confidence of the detection. @param value: Confidence of the detection. @type value: float @raise TypeError: If value is not a float. @raise ValueError: If value is not between 0 and 1.
property
label
Returns the label of the detection. @return: Label of the detection. @rtype: int
method
label.setter(self, value: int)
Sets the label of the detection. @param value: Label of the detection. @type value: int @raise TypeError: If value is not an integer.
property
label_name
Returns the label name of the detection. @return: Label name of the detection. @rtype: str
method
label_name.setter(self, value: str)
Sets the label name of the detection. @param value: Label name of the detection. @type value: str @raise TypeError: If value is not a string.
property
keypoints
Returns the keypoints. @return: List of keypoints. @rtype: Keypoints
method
keypoints.setter(self, value: List[Keypoint])
Sets the keypoints. @param value: List of keypoints. @type value: List[Keypoint] @raise TypeError: If value is not a list. @raise TypeError: If each element is not of type Keypoint.
class
depthai_nodes.message.ImgDetectionsExtended(depthai.Buffer)
method
__init__(self)
Initializes the ImgDetectionsExtended object.
method
copy(self)
Creates a new instance of the ImgDetectionsExtended class and copies the attributes. @return: A new instance of the ImgDetectionsExtended class. @rtype: ImgDetectionsExtended
property
detections
Returns the image detections with keypoints. @return: List of image detections with keypoints. @rtype: List[ImgDetectionExtended]
method
detections.setter(self, value: List[ImgDetectionExtended])
Sets the image detections with keypoints. @param value: List of image detections with keypoints. @type value: List[ImgDetectionExtended] @raise TypeError: If value is not a list. @raise TypeError: If each element is not of type ImgDetectionExtended.
property
masks
Returns the segmentation masks stored in a single numpy array. @return: Segmentation masks. @rtype: SegmentationMask
method
masks.setter(self, value: Union[NDArray[np.int16], SegmentationMask])
Sets the segmentation mask. @param value: Segmentation mask. @type value: NDArray[np.int16] @raise TypeError: If value is not a numpy array. @raise ValueError: If value is not a 2D numpy array. @raise ValueError: If each element is not of type int16. @raise ValueError: If any element is smaller than -1.
variable
transformation
Returns the Image Transformation object. @return: The Image Transformation object. @rtype: dai.ImgTransformation
method
transformation.setter(self, value: dai.ImgTransformation)
Sets the Image Transformation object. @param value: The Image Transformation object. @type value: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
setTransformation(self, transformation: dai.ImgTransformation)
Sets the Image Transformation object. @param transformation: The Image Transformation object. @type transformation: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
getTransformation(self) -> dai.ImgTransformation: dai.ImgTransformation
Returns the Image Transformation object. @return: The Image Transformation object. @rtype: dai.ImgTransformation
class
depthai_nodes.message.Keypoint(depthai.Buffer)
method
__init__(self)
Initializes the Keypoint object.
property
x
Returns the X coordinate of the keypoint. @return: X coordinate of the keypoint. @rtype: float
method
x.setter(self, value: float)
Sets the X coordinate of the keypoint. @param value: X coordinate of the keypoint. @type value: float @raise TypeError: If value is not a float. @raise ValueError: If value is not between 0 and 1.
property
y
Returns the Y coordinate of the keypoint. @return: Y coordinate of the keypoint. @rtype: float
method
y.setter(self, value: float)
Sets the Y coordinate of the keypoint. @param value: Y coordinate of the keypoint. @type value: float @raise TypeError: If value is not a float. @raise ValueError: If value is not between 0 and 1.
property
z
Returns the Z coordinate of the keypoint. @return: Z coordinate of the keypoint. @rtype: float
method
z.setter(self, value: float)
Sets the Z coordinate of the keypoint. @param value: Z coordinate of the keypoint. @type value: float @raise TypeError: If value is not a float.
property
confidence
Returns the confidence of the keypoint. @return: Confidence of the keypoint. @rtype: float
method
confidence.setter(self, value: float)
Sets the confidence of the keypoint. @param value: Confidence of the keypoint. @type value: float @raise TypeError: If value is not a float. @raise ValueError: If value is not between 0 and 1.
class
depthai_nodes.message.Keypoints(depthai.Buffer)
method
__init__(self)
Initializes the Keypoints object.
property
keypoints
Returns the keypoints. @return: List of keypoints. @rtype: List[Keypoint]
method
keypoints.setter(self, value: List[Keypoint])
Sets the keypoints. @param value: List of keypoints. @type value: List[Keypoint] @raise TypeError: If value is not a list. @raise TypeError: If each element is not of type Keypoint.
variable
transformation
Returns the Image Transformation object. @return: The Image Transformation object. @rtype: dai.ImgTransformation
method
transformation.setter(self, value: dai.ImgTransformation)
Sets the Image Transformation object. @param value: The Image Transformation object. @type value: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
setTransformation(self, transformation: dai.ImgTransformation)
Sets the Image Transformation object. @param transformation: The Image Transformation object. @type transformation: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
getPoints2f(self) -> dai.VectorPoint2f: dai.VectorPoint2f
Returns the keypoints in the form of a dai.VectorPoint2f object.
method
getPoints3f(self) -> List[dai.Point3f]: List[dai.Point3f]
Returns the keypoints in the form of a list of dai.Point3f objects.
method
getVisualizationMessage(self) -> dai.ImgAnnotations: dai.ImgAnnotations
Creates a default visualization message for the keypoints.
class
depthai_nodes.message.Line(depthai.Buffer)
method
__init__(self)
Initializes the Line object.
property
start_point
Returns the start point of the line. @return: Start point of the line. @rtype: dai.Point2f
method
start_point.setter(self, value: dai.Point2f)
Sets the start point of the line. @param value: Start point of the line. @type value: dai.Point2f @raise TypeError: If value is not of type dai.Point2f.
property
end_point
Returns the end point of the line. @return: End point of the line. @rtype: dai.Point2f
method
end_point.setter(self, value: dai.Point2f)
Sets the end point of the line. @param value: End point of the line. @type value: dai.Point2f @raise TypeError: If value is not of type dai.Point2f.
property
confidence
Returns the confidence of the line. @return: Confidence of the line. @rtype: float
method
confidence.setter(self, value: float)
Sets the confidence of the line. @param value: Confidence of the line. @type value: float @raise TypeError: If value is not of type float.
class
depthai_nodes.message.Lines(depthai.Buffer)
method
__init__(self)
Initializes the Lines object.
property
lines
Returns the lines. @return: List of lines. @rtype: List[Line]
method
lines.setter(self, value: List[Line])
Sets the lines. @param value: List of lines. @type value: List[Line] @raise TypeError: If value is not a list. @raise TypeError: If each element is not of type Line.
variable
transformation
Returns the Image Transformation object. @return: The Image Transformation object. @rtype: dai.ImgTransformation
method
transformation.setter(self, value: dai.ImgTransformation)
Sets the Image Transformation object. @param value: The Image Transformation object. @type value: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
setTransformation(self, transformation: dai.ImgTransformation)
Sets the Image Transformation object. @param transformation: The Image Transformation object. @type transformation: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
getVisualizationMessage(self) -> dai.ImgAnnotations: dai.ImgAnnotations
Returns default visualization message for lines. The message adds lines to the image.
class
depthai_nodes.message.Map2D(depthai.Buffer)
method
__init__(self)
Initializes the Map2D object.
property
map
Returns the 2D map. @return: 2D map. @rtype: NDArray[np.float32]
method
map.setter(self, value: np.ndarray)
Sets the 2D map. @param value: 2D map. @type value: NDArray[np.float32] @raise TypeError: If value is not a numpy array. @raise ValueError: If value is not a 2D numpy array. @raise ValueError: If each element is not of type float.
property
width
Returns the 2D map width. @return: 2D map width. @rtype: int
property
height
Returns the 2D map height. @return: 2D map height. @rtype: int
variable
transformation
Returns the Image Transformation object. @return: The Image Transformation object. @rtype: dai.ImgTransformation
method
transformation.setter(self, value: dai.ImgTransformation)
Sets the Image Transformation object. @param value: The Image Transformation object. @type value: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
setTransformation(self, transformation: dai.ImgTransformation)
Sets the Image Transformation object. @param transformation: The Image Transformation object. @type transformation: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
getVisualizationMessage(self) -> dai.ImgFrame: dai.ImgFrame
Returns default visualization message for 2D maps in the form of a colormapped image.
class
depthai_nodes.message.Prediction(depthai.Buffer)
method
__init__(self)
Initializes the Prediction object.
property
prediction
Returns the prediction. @return: The predicted value. @rtype: float
method
prediction.setter(self, value: float)
Sets the prediction. @param value: The predicted value. @type value: float @raise TypeError: If value is not of type float.
class
depthai_nodes.message.Predictions(depthai.Buffer)
method
__init__(self)
Initializes the Predictions object.
property
predictions
Returns the predictions. @return: List of predictions. @rtype: List[Prediction]
method
predictions.setter(self, value: List[Prediction])
Sets the predictions. @param value: List of predicted values. @type value: List[Prediction] @raise TypeError: If value is not a list. @raise ValueError: If each element is not of type Prediction.
property
prediction
Returns the first prediction. Useful for single predictions. @return: The predicted value. @rtype: float
variable
transformation
Returns the Image Transformation object. @return: The Image Transformation object. @rtype: dai.ImgTransformation
method
transformation.setter(self, value: dai.ImgTransformation)
Sets the Image Transformation object. @param value: The Image Transformation object. @type value: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
setTransformation(self, transformation: dai.ImgTransformation)
Sets the Image Transformation object. @param transformation: The Image Transformation object. @type transformation: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
getVisualizationMessage(self) -> dai.ImgAnnotations: dai.ImgAnnotations
Returns the visualization message for the predictions. The message adds text representing the predictions to the right of the image.
class
depthai_nodes.message.SegmentationMask(depthai.Buffer)
method
__init__(self)
Initializes the SegmentationMask object.
method
copy(self)
Creates a new instance of the SegmentationMask class and copies the attributes. @return: A new instance of the SegmentationMask class. @rtype: SegmentationMask
property
mask
Returns the segmentation mask. @return: Segmentation mask. @rtype: NDArray[np.int16]
method
mask.setter(self, value: NDArray[np.int16])
Sets the segmentation mask. @param value: Segmentation mask. @type value: NDArray[np.int16] @raise TypeError: If value is not a numpy array. @raise ValueError: If value is not a 2D numpy array. @raise ValueError: If each element is not of type int16. @raise ValueError: If any element is smaller than -1.
variable
transformation
Returns the Image Transformation object. @return: The Image Transformation object. @rtype: dai.ImgTransformation
method
transformation.setter(self, value: dai.ImgTransformation)
Sets the Image Transformation object. @param value: The Image Transformation object. @type value: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
setTransformation(self, transformation: dai.ImgTransformation)
Sets the Image Transformation object. @param transformation: The Image Transformation object. @type transformation: dai.ImgTransformation @raise TypeError: If value is not a dai.ImgTransformation object.
method
getVisualizationMessage(self) -> dai.ImgFrame: dai.ImgFrame
Returns the default visualization message for segmentation masks.
package
depthai_nodes.node
class
ApplyColormap
A host node that applies a colormap to a given 2D array (e.g. depth maps, segmentation masks, heatmaps, etc.). Attributes ---------- colormap_value : Optional[int] OpenCV colormap enum value. Determines the applied color mapping. max_value : Optional[int] Maximum value to consider for normalization. If set lower than the map's actual maximum, the map's maximum will be used instead. instance_to_semantic_mask : Optional[bool] If True, converts instance segmentation masks to semantic segmentation masks. Note that this is only relevant for ImgDetectionsExtended messages. arr : dai.ImgFrame or Map2D or ImgDetectionsExtended The input message with a 2D array. output : dai.ImgFrame The output message for a colorized frame.
class
DepthMerger
DepthMerger is a custom host node for merging 2D detections with depth information to produce spatial detections. Attributes ---------- output : dai.Node.Output The output of the DepthMerger node containing dai.SpatialImgDetections. shrinking_factor : float The shrinking factor for the bounding box. 0 means no shrinking. The factor means the percentage of the bounding box to shrink from each side. Usage ----- depth_merger = pipeline.create(DepthMerger).build( output_2d=nn.out, output_depth=stereo.depth )
class
HostSpatialsCalc
HostSpatialsCalc is a helper class for calculating spatial coordinates from depth data. Attributes ---------- calibData : dai.CalibrationHandler Calibration data handler for the device. depth_alignment_socket : dai.CameraBoardSocket The camera socket used for depth alignment. delta : int The delta value for ROI calculation. Default is 5 - means 10x10 depth pixels around point for depth averaging. thresh_low : int The lower threshold for depth values. Default is 200 - means 20cm. thresh_high : int The upper threshold for depth values. Default is 30000 - means 30m.
class
ImgDetectionsBridge
Transforms the dai.ImgDetections to ImgDetectionsExtended object or vice versa. Note that conversion from ImgDetectionsExtended to ImgDetection loses information about segmentation, keypoints and rotation. Attributes ---------- input : dai.ImgDetections or ImgDetectionsExtended The input message for the ImgDetections object. output : dai.ImgDetections or ImgDetectionsExtended The output message of the transformed ImgDetections object.
class
ImgDetectionsFilter
Filters out detections based on the specified criteria and outputs them as a separate message. The order of operations: 1. Filter by label/confidence; 2. Sort (if applicable); 3. Subset. Attributes ---------- labels_to_keep : List[int] Labels to keep. Only detections with labels in this list will be kept. labels_to_reject: List[int] Labels to reject. Only detections with labels not in this list will be kept. confidence_threshold : float Minimum confidence threshold. Detections with confidence below this threshold will be filtered out. max_detections : int Maximum number of detections to keep. If not defined, all detections will be kept. sort_by_confidence: bool Whether to sort the detections by confidence before subsetting. If True, the detections will be sorted in descending order of confidence. It's set to False by default.
class
ImgFrameOverlay
A host node that receives two dai.ImgFrame objects and overlays them into a single one. Attributes ---------- frame1 : dai.ImgFrame The input message for the background frame. frame2 : dai.ImgFrame The input message for the foreground frame. alpha: float The weight of the background frame in the overlay. By default, the weight is 0.5 which means that both frames are represented equally in the overlay. out : dai.ImgFrame The output message for the overlay frame.
class
ParserGenerator
General interface for instantiating parsers based on the provided model archive. The `build` method creates parsers based on the head information stored in the NN Archive. The method then returns a dictionary of these parsers.
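A sketch of the build call implied by the description above; the import path, archive filename, and exact build arguments are assumptions and may differ from the actual API:

```python
import depthai as dai
from depthai_nodes.node import ParserGenerator  # import path assumed

pipeline = dai.Pipeline()
nn_archive = dai.NNArchive("path/to/model.tar.xz")  # hypothetical archive path
parsers = pipeline.create(ParserGenerator).build(nn_archive)
# `build` returns a dictionary of parser nodes, one per head defined in the archive.
```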
class
BaseParser
Base class for neural network output parsers. This class serves as a foundation for specific parser implementations used to postprocess the outputs of neural network models. Each parser is attached to a model "head" that governs the parsing process as it contains all the necessary information for the parser to function correctly. Subclasses should implement `build` method to correctly set all parameters of the parser and the `run` method to define the parsing logic. Attributes ---------- input : Node.Input Node's input. It is a linking point to which the Neural Network's output is linked. It accepts the output of the Neural Network node. out : Node.Output Parser sends the processed network results to this output in a form of DepthAI message. It is a linking point from which the processed network results are retrieved.
class
ClassificationParser
Postprocessing logic for Classification model. Attributes ---------- output_layer_name: str Name of the output layer relevant to the parser. classes : List[str] List of class names to be used for linking with their respective scores. Expected to be in the same order as Neural Network's output. If not provided, the message will only return sorted scores. is_softmax : bool = True If False, the scores are converted to probabilities using softmax function. Output Message/s ---------------- **Type** : Classifications(dai.Buffer) **Description**: An object with attributes `classes` and `scores`. `classes` is a list of classes, sorted in descending order of scores. `scores` is a list of corresponding scores.
class
ClassificationSequenceParser
Postprocessing logic for a classification sequence model. The model predicts the classes multiple times and returns a list of predicted classes, where each item corresponds to the relative step in the sequence. In addition to time series classification, this parser can also be used for text recognition models where words can be interpreted as a sequence of characters (classes). Attributes ---------- output_layer_name: str Name of the output layer relevant to the parser. classes: List[str] List of available classes for the model. is_softmax: bool If False, the scores are converted to probabilities using softmax function. ignored_indexes: List[int] List of indexes to ignore during classification generation (e.g., background class, blank space). remove_duplicates: bool If True, removes consecutive duplicates from the sequence. concatenate_classes: bool If True, concatenates consecutive words based on the predicted spaces. Output Message/s ---------------- **Type**: Classifications(dai.Buffer) **Description**: An object with attributes `classes` and `scores`. `classes` is a list containing the predicted classes. `scores` is a list of corresponding probability scores.
class
DetectionParser
Parser class for parsing the output of a "general" detection model. The parser expects the output of the model to have two tensors: one for bounding boxes and one for scores. Tensor for bboxes should be of shape (N, 4) and scores should be of shape (N,). Bboxes are expected to be in the format [xmin, ymin, xmax, ymax]. If this is not the case you can check other parsers or create a new one. As the result, the node sends out the detected objects in the form of a message containing bounding boxes and confidence scores. Attributes ---------- output_layer_name: str Name of the output layer relevant to the parser. conf_threshold : float Confidence score threshold of detected bounding boxes. iou_threshold : float Non-maximum suppression threshold. max_det : int Maximum number of detections to keep. label_names : List[str] List of label names for detected objects. Output Message/s ------- **Type**: ImgDetectionsExtended **Description**: ImgDetectionsExtended message containing bounding boxes and confidence scores of detected objects. ----------------
class
EmbeddingsParser
Parser class for parsing the output of embeddings neural network model head. Attributes ---------- output_layer_name: str Name of the output layer relevant to the parser. Output Message/s ---------------- **Type**: dai.NNData **Description**: The output layer of the neural network model head.
class
FastSAMParser
Parser class for parsing the output of the FastSAM model. Attributes ---------- conf_threshold : float Confidence score threshold for detected faces. n_classes : int Number of classes in the model. iou_threshold : float Non-maximum suppression threshold. mask_conf : float Mask confidence threshold. prompt : str Prompt type. points : Tuple[int, int] Points. point_label : int Point label. bbox : Tuple[int, int, int, int] Bounding box. yolo_outputs : List[str] Names of the YOLO outputs. mask_outputs : List[str] Names of the mask outputs. protos_output : str Name of the protos output. Output Message/s ---------------- **Type**: SegmentationMask **Description**: SegmentationMask message containing the resulting segmentation masks given the prompt. Error Handling --------------
class
HRNetParser
Parser class for parsing the output of the HRNet pose estimation model. The code is inspired by https://github.com/ibaiGorordo/ONNX-HRNET-Human-Pose-Estimation. Attributes ---------- output_layer_name: str Name of the output layer relevant to the parser. score_threshold : float Confidence score threshold for detected keypoints. Output Message/s ---------------- **Type**: Keypoints **Description**: Keypoints message containing detected body keypoints.
class
ImageOutputParser
Parser class for image-to-image models (e.g. DnCNN3, zero-dce etc.) where the output is a modified image (denoised, enhanced etc.). Attributes ---------- output_layer_name: str Name of the output layer relevant to the parser. output_is_bgr : bool Flag indicating if the output image is in BGR (Blue-Green-Red) format. Output Message/s ------- **Type**: dai.ImgFrame **Description**: Image message containing the output image e.g. denoised or enhanced images. Error Handling -------------- **ValueError**: If the output is not 3- or 4-dimensional. **ValueError**: If the number of output layers is not 1.
class
KeypointParser
Parser class for 2D or 3D keypoints models. It expects one output layer containing keypoints. The number of keypoints must be specified. Moreover, the keypoints are normalized by a scale factor if provided. Attributes ---------- output_layer_name: str Name of the output layer relevant to the parser. scale_factor : float Scale factor to divide the keypoints by. n_keypoints : int Number of keypoints the model detects. score_threshold : float Confidence score threshold for detected keypoints. Output Message/s ---------------- **Type**: Keypoints **Description**: Keypoints message containing 2D or 3D keypoints. Error Handling -------------- **ValueError**: If the number of keypoints is not specified. **ValueError**: If the number of coordinates per keypoint is not 2 or 3. **ValueError**: If the number of output layers is not 1.
class
LaneDetectionParser
Parser class for the Ultra-Fast-Lane-Detection model. It expects one output layer containing the lane detection results. It supports two versions of the model: CuLane and TuSimple. Results are represented with clusters of points. Attributes ---------- output_layer_name: str Name of the output layer relevant to the parser. row_anchors : List[int] List of row anchors. griding_num : int Griding number. cls_num_per_lane : int Number of points per lane. input_size : Tuple[int, int] Input size (width, height). Output Message/s ---------------- **Type**: Clusters **Description**: Detected lanes represented as clusters of points. Error Handling -------------- **ValueError**: If the row anchors are not specified. **ValueError**: If the griding number is not specified. **ValueError**: If the number of points per lane is not specified.
class
MapOutputParser
A parser class for models that produce map outputs, such as depth maps (e.g. DepthAnything), density maps (e.g. DM-Count), heat maps, and similar. Attributes ---------- output_layer_name: str Name of the output layer relevant to the parser. min_max_scaling : bool If True, the map is scaled to the range [0, 1]. Output Message/s ---------------- **Type**: Map2D **Description**: Density message containing the density map. The density map is represented with Map2D object.
class
MPPalmDetectionParser
Parser class for parsing the output of the Mediapipe Palm detection model. As the result, the node sends out the detected hands in the form of a message containing bounding boxes, labels, and confidence scores. Attributes ---------- output_layer_names: List[str] Names of the output layers relevant to the parser. conf_threshold : float Confidence score threshold for detected hands. iou_threshold : float Non-maximum suppression threshold. max_det : int Maximum number of detections to keep. scale : int Scale of the input image. Output Message/s ------- **Type**: ImgDetectionsExtended **Description**: ImgDetectionsExtended message containing bounding boxes, labels, and confidence scores of detected hands. See also -------- Official MediaPipe Hands solution: https://ai.google.dev/edge/mediapipe/solutions/vision/hand_landmarker
class
MLSDParser
Parser class for parsing the output of the M-LSD line detection model. The parser is specifically designed to parse the output of the M-LSD model. As the result, the node sends out the detected lines in the form of a message. Attributes ---------- output_layer_tpmap : str Name of the output layer containing the tpMap tensor. output_layer_heat : str Name of the output layer containing the heat tensor. topk_n : int Number of top candidates to keep. score_thr : float Confidence score threshold for detected lines. dist_thr : float Distance threshold for merging lines. Output Message/s ---------------- **Type**: LineDetections **Description**: LineDetections message containing detected lines and confidence scores.
class
PPTextDetectionParser
Parser class for parsing the output of the PaddlePaddle OCR text detection model. Attributes ---------- output_layer_name: str Name of the output layer relevant to the parser. conf_threshold : float The threshold for bounding boxes. mask_threshold : float The threshold for the mask. max_det : int The maximum number of candidate bounding boxes. Output Message/s ------- **Type**: dai.ImgDetections **Description**: ImgDetections message containing bounding boxes and the respective confidence scores of detected text.
class
RegressionParser
Parser class for parsing the output of a model with regression output (e.g. Age- Gender). Attributes ---------- output_layer_name : str Name of the output layer relevant to the parser. Output Message/s ---------------- **Type**: Predictions **Description**: Message containing the prediction(s).
class
SCRFDParser
Parser class for parsing the output of the SCRFD face detection model. Attributes ---------- output_layer_name: List[str] Names of the output layers relevant to the parser. conf_threshold : float Confidence score threshold for detected faces. iou_threshold : float Non-maximum suppression threshold. max_det : int Maximum number of detections to keep. input_size : tuple Input size of the model. feat_stride_fpn : tuple Tuple of the feature strides. num_anchors : int Number of anchors. Output Message/s ---------------- **Type**: dai.ImgDetections **Description**: ImgDetections message containing bounding boxes, labels, and confidence scores of detected faces.
class
SegmentationParser
Parser class for parsing the output of segmentation models. Attributes ---------- output_layer_name: str Name of the output layer relevant to the parser. classes_in_one_layer : bool Whether all classes are in one layer in the multi-class segmentation model. Default is False. If True, the parser will use np.max instead of np.argmax to get the class map. Output Message/s ---------------- **Type**: dai.ImgFrame **Description**: Segmentation message containing the segmentation mask. Every pixel belongs to exactly one class. Unassigned pixels are represented with "-1" and class pixels with non-negative integers. Error Handling -------------- **ValueError**: If the number of output layers is not 1. **ValueError**: If the number of dimensions of the output tensor is not 3.
class
SuperAnimalParser
Parser class for parsing the output of the SuperAnimal landmark model. Attributes ---------- output_layer_name: str Name of the output layer relevant to the parser. scale_factor : float Scale factor to divide the keypoints by. n_keypoints : int Number of keypoints. score_threshold : float Confidence score threshold for detected keypoints. Output Message/s ---------------- **Type**: Keypoints **Description**: Keypoints message containing detected keypoints that exceeds confidence threshold.
class
XFeatMonoParser
Parser class for parsing the output of the XFeat model. It can be used for parsing the output from one source (e.g. one camera). The reference frame can be set with trigger method. Attributes ---------- output_layer_feats : str Name of the output layer containing features. output_layer_keypoints : str Name of the output layer containing keypoints. output_layer_heatmaps : str Name of the output layer containing heatmaps. original_size : Tuple[float, float] Original image size. input_size : Tuple[float, float] Input image size. max_keypoints : int Maximum number of keypoints to keep. previous_results : np.ndarray Previous results from the model. Previous results are used to match keypoints between two frames. trigger : bool Trigger to set the reference frame. Output Message/s ---------------- **Type**: dai.TrackedFeatures **Description**: TrackedFeatures message containing matched keypoints with the same ID. Error Handling -------------- **ValueError**: If the original image size is not specified. **ValueError**: If the input image size is not specified. **ValueError**: If the maximum number of keypoints is not specified. **ValueError**: If the output layer containing features is not specified. **ValueError**: If the output layer containing keypoints is not specified. **ValueError**: If the output layer containing heatmaps is not specified.
class
XFeatStereoParser
Parser class for parsing the output of the XFeat model. It can be used for parsing the output from two sources (e.g. two cameras - left and right). Attributes ---------- reference_input : Node.Input Node's input. It is a linking point to which the Neural Network's output is linked. It accepts the output of the Neural Network node. target_input : Node.Input Node's input. It is a linking point to which the Neural Network's output is linked. It accepts the output of the Neural Network node. out : Node.Output Parser sends the processed network results to this output in a form of DepthAI message. It is a linking point from which the processed network results are retrieved. output_layer_feats : str Name of the output layer from which the features are extracted. output_layer_keypoints : str Name of the output layer from which the keypoints are extracted. output_layer_heatmaps : str Name of the output layer from which the heatmaps are extracted. original_size : Tuple[float, float] Original image size. input_size : Tuple[float, float] Input image size. max_keypoints : int Maximum number of keypoints to keep. Output Message/s ---------------- **Type**: dai.TrackedFeatures **Description**: TrackedFeatures message containing matched keypoints with the same ID. Error Handling -------------- **ValueError**: If the original image size is not specified. **ValueError**: If the input image size is not specified. **ValueError**: If the maximum number of keypoints is not specified. **ValueError**: If the output layer containing features is not specified. **ValueError**: If the output layer containing keypoints is not specified. **ValueError**: If the output layer containing heatmaps is not specified.
class
YOLOExtendedParser
Parser class for parsing the output of the YOLO Instance Segmentation and Pose Estimation models. Attributes ---------- conf_threshold : float Confidence score threshold for detected faces. n_classes : int Number of classes in the model. label_names : Optional[List[str]] Names of the classes. iou_threshold : float Intersection over union threshold. mask_conf : float Mask confidence threshold. n_keypoints : int Number of keypoints in the model. anchors : Optional[List[List[List[float]]]] Anchors for the YOLO model (optional). subtype : str Version of the YOLO model. Output Message/s ---------------- **Type**: ImgDetectionsExtended **Description**: Message containing bounding boxes, labels, label names, confidence scores, and keypoints or masks and protos of the detected objects.
class
YuNetParser
Parser class for parsing the output of the YuNet face detection model. Attributes ---------- conf_threshold : float Confidence score threshold for detected faces. iou_threshold : float Non-maximum suppression threshold. max_det : int Maximum number of detections to keep. input_size : Tuple[int, int] Input size (width, height). loc_output_layer_name: str Name of the output layer containing the location predictions. conf_output_layer_name: str Name of the output layer containing the confidence predictions. iou_output_layer_name: str Name of the output layer containing the IoU predictions. Output Message/s ---------------- **Type**: ImgDetectionsExtended **Description**: Message containing bounding boxes, labels, confidence scores, and keypoints of detected faces.
class
TilesPatcher
Handles the processing of tiled frames from neural network (NN) outputs, maps the detections from tiles back into the global frame, and sends out the combined detections for further processing. @ivar conf_thresh: Confidence threshold for filtering detections. @type conf_thresh: float @ivar iou_thresh: IOU threshold for non-max suppression. @type iou_thresh: float @ivar tile_manager: Manager responsible for handling tiling configurations. @type tile_manager: Tiling @ivar tile_buffer: Buffer to store tile detections temporarily. @type tile_buffer: list @ivar current_timestamp: Timestamp for the current frame being processed. @type current_timestamp: float @ivar expected_tiles_count: Number of tiles expected per frame. @type expected_tiles_count: int
class
Tiling
Manages tiling of input frames for neural network processing, divides frames into overlapping tiles based on configuration parameters, and creates ImgFrames for each tile to be sent to a neural network node. @ivar overlap: Overlap between adjacent tiles, valid in [0,1). @type overlap: float @ivar grid_size: Grid size (number of tiles horizontally and vertically). @type grid_size: tuple @ivar grid_matrix: The matrix representing the grid of tiles. @type grid_matrix: list @ivar nn_shape: Shape of the neural network input. @type nn_shape: tuple @ivar x: Vector representing the tile's dimensions. @type x: list @ivar tile_positions: Coordinates and scaled sizes of the tiles. @type tile_positions: list @ivar img_shape: Shape of the original input image. @type img_shape: tuple @ivar global_detection: Whether to use global detection. @type global_detection: bool
package
depthai_nodes.node.parsers
module
module
module
module
module
module
module
module
module
module
module
module
module
module
module
module
module
module
package
module
module
module
module
depthai_nodes.node.parsers.base_parser
class
package
depthai_nodes.node.parsers.utils
module
module
function
decode_head(head) -> Dict[str, Any]: Dict[str, Any]
Decode head object into a dictionary containing configuration details. @param head: The head object to decode. @type head: dai.nn_archive.v1.Head @return: A dictionary containing configuration details relevant to the head. @rtype: Dict[str, Any]
module
module
module
module
module
medipipe
mediapipe.py. Description: This script contains utility functions for decoding the output of the MediaPipe hand tracking model. This script contains code that is based on or directly taken from a public GitHub repository: https://github.com/geaxgx/depthai_hand_tracker Original code author(s): geaxgx License: MIT License Copyright (c) [2021] [geax]
module
module
module
module
module
module
module
module
module
function
sigmoid(x: np.ndarray) -> np.ndarray: np.ndarray
Sigmoid function. @param x: Input tensor. @type x: np.ndarray @return: A result tensor after applying a sigmoid function on the given input. @rtype: np.ndarray
function
softmax(x: np.ndarray, axis: Optional[int] = None, keep_dims: bool = False) -> np.ndarray: np.ndarray
Compute the softmax of an array. The softmax function is defined as: softmax(x) = exp(x) / sum(exp(x)) @param x: The input array. @type x: np.ndarray @param axis: Axis or axes along which a sum is performed. The default, axis=None, will sum all of the elements of the input array. If axis is negative it counts from the last to the first axis. @type axis: int @param keep_dims: If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array. @type keep_dims: bool @return: The softmax of the input array. @rtype: np.ndarray
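For orientation, here is a minimal NumPy sketch of the softmax formula described above. The helper name and the max-subtraction for numerical stability are illustrative assumptions, not the library implementation:

```python
import numpy as np

def softmax_example(x: np.ndarray, axis: int = -1) -> np.ndarray:
    # Subtract the per-axis max first so exp() cannot overflow.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=axis, keepdims=True)

print(softmax_example(np.array([1.0, 2.0, 3.0])))  # values sum to 1.0
```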
function
corners_to_rotated_bbox(corners: np.ndarray) -> np.ndarray: np.ndarray
Converts the corners of a bounding box to a rotated bounding box. @param corners: The corners of the bounding box. The corners are expected to be ordered by top-left, top-right, bottom-right, bottom-left. @type corners: np.ndarray @return: The rotated bounding box defined as [x_center, y_center, width, height, angle]. @rtype: np.ndarray
function
normalize_bboxes(bboxes: np.ndarray, height: int, width: int) -> np.ndarray: np.ndarray
Normalize bounding box coordinates to (0, 1). @param bboxes: A numpy array of shape (N, 4) containing the bounding boxes. @type np.ndarray @param height: The height of the image. @type height: int @param width: The width of the image. @type width: int @return: A numpy array of shape (N, 4) containing the normalized bounding boxes. @type np.ndarray
function
rotated_bbox_to_corners(cx: float, cy: float, w: float, h: float, rotation: float) -> np.ndarray: np.ndarray
Converts a rotated bounding box to the corners of the bounding box. @param cx: The x-coordinate of the center of the bounding box. @type cx: float @param cy: The y-coordinate of the center of the bounding box. @type cy: float @param w: The width of the bounding box. @type w: float @param h: The height of the bounding box. @type h: float @param rotation: The angle of the bounding box given in degrees. @type rotation: float @return: The corners of the bounding box. @rtype: np.ndarray
function
top_left_wh_to_xywh(bboxes: np.ndarray) -> np.ndarray: np.ndarray
Converts bounding boxes from [top_left_x, top_left_y, width, height] to [x_center, y_center, width, height]. @param bboxes: The bounding boxes to convert. @type bboxes: np.ndarray @return: The converted bounding boxes. @rtype: np.ndarray
function
xywh_to_xyxy(bboxes: np.ndarray) -> np.ndarray: np.ndarray
Convert bounding box coordinates from (x_center, y_center, width, height) to (x_min, y_min, x_max, y_max). @param bboxes: A numpy array of shape (N, 4) containing the bounding boxes in (x, y, width, height) format. @type np.ndarray @return: A numpy array of shape (N, 4) containing the bounding boxes in (x_min, y_min, x_max, y_max) format. @type np.ndarray
function
xyxy_to_xywh(bboxes: np.ndarray) -> np.ndarray: np.ndarray
Converts bounding boxes from [x_min, y_min, x_max, y_max] to [x_center, y_center, width, height]. @param bboxes: The bounding boxes to convert. @type bboxes: np.ndarray @return: The converted bounding boxes. @rtype: np.ndarray
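The two box-format conversions above are simple re-parameterisations; a minimal NumPy sketch (illustrative helper names, not the library functions):

```python
import numpy as np

def xywh_to_xyxy_example(bboxes: np.ndarray) -> np.ndarray:
    # (x_center, y_center, w, h) -> (x_min, y_min, x_max, y_max)
    out = bboxes.astype(float)
    out[:, 0] = bboxes[:, 0] - bboxes[:, 2] / 2
    out[:, 1] = bboxes[:, 1] - bboxes[:, 3] / 2
    out[:, 2] = bboxes[:, 0] + bboxes[:, 2] / 2
    out[:, 3] = bboxes[:, 1] + bboxes[:, 3] / 2
    return out

def xyxy_to_xywh_example(bboxes: np.ndarray) -> np.ndarray:
    # (x_min, y_min, x_max, y_max) -> (x_center, y_center, w, h)
    out = bboxes.astype(float)
    out[:, 0] = (bboxes[:, 0] + bboxes[:, 2]) / 2
    out[:, 1] = (bboxes[:, 1] + bboxes[:, 3]) / 2
    out[:, 2] = bboxes[:, 2] - bboxes[:, 0]
    out[:, 3] = bboxes[:, 3] - bboxes[:, 1]
    return out

boxes = np.array([[50.0, 50.0, 20.0, 10.0]])  # xywh
assert np.allclose(xyxy_to_xywh_example(xywh_to_xyxy_example(boxes)), boxes)
```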
function
unnormalize_image(image, normalize = True)
Un-normalize an image tensor by scaling it to the [0, 255] range. @param image: The normalized image tensor of shape (H, W, C) or (C, H, W). @type image: np.ndarray @param normalize: Whether to normalize the image tensor. Defaults to True. @type normalize: bool @return: The un-normalized image. @rtype: np.ndarray
module
depthai_nodes.node.parsers.utils.fastsam
function
box_prompt(masks: np.ndarray, bbox: Tuple[int, int, int, int], orig_shape: Tuple[int, int]) -> np.ndarray: np.ndarray
Modifies the bounding box properties and calculates IoU between masks and bounding box. Source: https://github.com/ultralytics/ultralytics/blob/main/ultralytics/models/fastsam/prompt.py#L286 Modified so it uses numpy instead of torch. @param masks: The resulting masks of the FastSAM model @type masks: np.ndarray @param bbox: The prompt bounding box coordinates @type bbox: Tuple[int, int, int, int] @param orig_shape: The original shape of the image @type orig_shape: Tuple[int, int] (height, width) @return: The modified masks @rtype: np.ndarray
function
format_results(bboxes: np.ndarray, masks: np.ndarray, filter: int = 0) -> List[Dict[str, Any]]: List[Dict[str, Any]]
Formats detection results into list of annotations each containing ID, segmentation, bounding box, score and area. Source: https://github.com/ultralytics/ultralytics/blob/main/ultralytics/models/fastsam/prompt.py#L56 @param bboxes: The bounding boxes of the detected objects @type bboxes: np.ndarray @param masks: The masks of the detected objects @type masks: np.ndarray @param filter: The filter value @type filter: int @return: The formatted annotations @rtype: List[Dict[str, Any]]
function
point_prompt(bboxes: np.ndarray, masks: np.ndarray, points: List[Tuple[int, int]], pointlabel: List[int], orig_shape: Tuple[int, int]) -> np.ndarray: np.ndarray
Adjusts points on detected masks based on user input and returns the modified results. Source: https://github.com/ultralytics/ultralytics/blob/main/ultralytics/models/fastsam/prompt.py#L321 Modified so it uses numpy instead of torch. @param bboxes: The bounding boxes of the detected objects @type bboxes: np.ndarray @param masks: The masks of the detected objects @type masks: np.ndarray @param points: The points to adjust @type points: List[Tuple[int, int]] @param pointlabel: The point labels @type pointlabel: List[int] @param orig_shape: The original shape of the image @type orig_shape: Tuple[int, int] (height, width) @return: The modified masks @rtype: np.ndarray
function
adjust_bboxes_to_image_border(boxes: np.ndarray, image_shape: Tuple[int, int], threshold: int = 20) -> np.ndarray: np.ndarray
Source: https://github.com/ultralytics/ultralytics/blob/main/ultralytics/models/fastsam/utils.py#L6 (Ultralytics) Adjust bounding boxes to stick to image border if they are within a certain threshold. @param boxes: Bounding boxes @type boxes: np.ndarray @param image_shape: Image shape @type image_shape: Tuple[int, int] @param threshold: Pixel threshold @type threshold: int @return: Adjusted bounding boxes @rtype: np.ndarray
function
bbox_iou(box1: np.ndarray, boxes: np.ndarray, iou_thres: float = 0.9, image_shape: Tuple[int, int] = (640, 640), raw_output: bool = False) -> np.ndarray: np.ndarray
Source: https://github.com/ultralytics/ultralytics/blob/main/ultralytics/models/fastsam/utils.py#L30 (Ultralytics - rewritten to numpy) Compute the Intersection-Over-Union of a bounding box with respect to an array of other bounding boxes. @param box1: Array of shape (4, ) representing a single bounding box. @type box1: np.ndarray @param boxes: Array of shape (n, 4) representing multiple bounding boxes. @type boxes: np.ndarray @param iou_thres: IoU threshold @type iou_thres: float @param image_shape: Image shape (height, width) @type image_shape: Tuple[int, int] @param raw_output: If True, return the raw IoU values instead of the indices @type raw_output: bool @return: Indices of boxes with IoU > thres, or the raw IoU values if raw_output is True @rtype: np.ndarray
function
decode_fastsam_output(outputs: List[np.ndarray], strides: List[int], anchors: List[Optional[np.ndarray]], img_shape: Tuple[int, int], conf_thres: float = 0.5, iou_thres: float = 0.45, num_classes: int = 1) -> np.ndarray: np.ndarray
Decode the output of the FastSAM model. @param outputs: List of FastSAM outputs @type outputs: List[np.ndarray] @param strides: List of strides @type strides: List[int] @param anchors: List of anchors @type anchors: List[Optional[np.ndarray]] @param img_shape: Image shape @type img_shape: Tuple[int, int] @param conf_thres: Confidence threshold @type conf_thres: float @param iou_thres: IoU threshold @type iou_thres: float @param num_classes: Number of classes @type num_classes: int @return: NMS output @rtype: np.ndarray
function
crop_mask(masks: np.ndarray, box: np.ndarray) -> np.ndarray: np.ndarray
It takes a mask and a bounding box, and returns a mask that is cropped to the bounding box. @param masks: [h, w] array of masks @type masks: np.ndarray @param box: An array of bbox coordinates in (x1, y1, x2, y2) format @type box: np.ndarray @return: The masks are being cropped to the bounding box. @rtype: np.ndarray
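A small NumPy sketch of the cropping idea (zero the mask outside the box). The helper is illustrative only, assuming the (x1, y1, x2, y2) box is given in pixels:

```python
import numpy as np

def crop_mask_example(mask: np.ndarray, box: np.ndarray) -> np.ndarray:
    # Keep mask values inside the (x1, y1, x2, y2) box, zero everything else.
    h, w = mask.shape
    x1, y1, x2, y2 = box.astype(int)
    keep = np.zeros_like(mask)
    keep[max(y1, 0):min(y2, h), max(x1, 0):min(x2, w)] = 1
    return mask * keep

mask = np.ones((4, 4), dtype=np.float32)
print(crop_mask_example(mask, np.array([1, 1, 3, 3])))  # only the 2x2 centre stays 1
```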
function
process_single_mask(protos: np.ndarray, mask_coeff: np.ndarray, mask_conf: float, img_shape: Tuple[int, int], bbox: Tuple[int, int, int, int]) -> np.ndarray: np.ndarray
Processes a single mask. @param protos: Prototypes @type protos: np.ndarray @param mask_coeff: Mask coefficients @type mask_coeff: np.ndarray @param mask_conf: Mask confidence @type mask_conf: float @param img_shape: Image shape @type img_shape: Tuple[int, int] @param bbox: Bounding box @type bbox: Tuple[int, int, int, int] @return: Processed mask @rtype: np.ndarray
function
merge_masks(masks: np.ndarray) -> np.ndarray: np.ndarray
Merge masks to a 2D array where each object is represented by a unique label. @param masks: 3D array of masks @type masks: np.ndarray @return: 2D array of masks @rtype: np.ndarray
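A sketch of merging a stack of binary masks into a single label map, where pixels of the i-th object get value i+1 and background stays 0 (an assumed labelling convention, for illustration only):

```python
import numpy as np

def merge_masks_example(masks: np.ndarray) -> np.ndarray:
    # masks: (n, h, w) stack of binary masks -> (h, w) label map.
    merged = np.zeros(masks.shape[1:], dtype=np.int32)
    for idx, mask in enumerate(masks, start=1):
        merged[mask > 0] = idx  # later masks overwrite earlier ones on overlap
    return merged

masks = np.zeros((2, 3, 3), dtype=np.uint8)
masks[0, 0, :] = 1
masks[1, 2, :] = 1
print(merge_masks_example(masks))
```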
module
depthai_nodes.node.parsers.utils.keypoints
function
normalize_keypoints(keypoints: np.ndarray, height: int, width: int) -> np.ndarray: np.ndarray
Normalize keypoint coordinates to (0, 1). Parameters: @param keypoints: A numpy array of shape (N, 2) or (N, K, 2) where N is the number of keypoint sets and K is the number of keypoints in each set. @type np.ndarray @param height: The height of the image. @type height: int @param width: The width of the image. @type width: int Returns: np.ndarray: A numpy array of the same shape as the input containing the normalized keypoints.
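Normalisation is a per-axis division by the image size; a minimal illustrative sketch, assuming the usual (x, y) ordering of the last dimension:

```python
import numpy as np

def normalize_keypoints_example(keypoints: np.ndarray, height: int, width: int) -> np.ndarray:
    # Last dimension is (x, y): divide x by width and y by height.
    return keypoints / np.array([width, height], dtype=np.float32)

kpts = np.array([[320.0, 240.0], [0.0, 480.0]])
print(normalize_keypoints_example(kpts, height=480, width=640))  # [[0.5 0.5] [0.  1. ]]
```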
module
depthai_nodes.node.parsers.utils.masks_utils
function
crop_mask(mask: np.ndarray, bbox: np.ndarray) -> np.ndarray: np.ndarray
It takes a mask and a bounding box, and returns a mask that is cropped to the bounding box. @param mask: [h, w] numpy array of a single mask @type mask: np.ndarray @param bbox: A numpy array of bbox coordinates in (x_center, y_center, width, height) format @type bbox: np.ndarray @return: A mask that is cropped to the bounding box @rtype: np.ndarray
function
process_single_mask(protos: np.ndarray, mask_coeff: np.ndarray, mask_conf: float, bbox: np.ndarray) -> np.ndarray: np.ndarray
Process a single mask. @param protos: Protos. @type protos: np.ndarray @param mask_coeff: Mask coefficient. @type mask_coeff: np.ndarray @param mask_conf: Mask confidence. @type mask_conf: float @param bbox: A numpy array of bbox coordinates in (x_center, y_center, width, height) normalized format. @type bbox: np.ndarray @return: Processed mask. @rtype: np.ndarray
function
module
depthai_nodes.node.parsers.utils.medipipe
class
HandRegion
Attributes: pd_score : detection score pd_box : detection box [x, y, w, h], normalized [0,1] in the squared image pd_kps : detection keypoints coordinates [x, y], normalized [0,1] in the squared image rect_x_center, rect_y_center : center coordinates of the rotated bounding rectangle, normalized [0,1] in the squared image rect_w, rect_h : width and height of the rotated bounding rectangle, normalized in the squared image (may be > 1) rotation : rotation angle of rotated bounding rectangle with y-axis in radian rect_x_center_a, rect_y_center_a : center coordinates of the rotated bounding rectangle, in pixels in the squared image rect_w, rect_h : width and height of the rotated bounding rectangle, in pixels in the squared image rect_points : list of the 4 points coordinates of the rotated bounding rectangle, in pixels expressed in the squared image during processing, expressed in the source rectangular image when returned to the user
variable
function
function
generate_anchors(options)
options : SSDAnchorOptions # https://github.com/google/mediapipe/blob/master/mediapipe/calculators/tflite/ssd_anchors_calculator.cc
function
function
function
rect_transformation(regions, w, h, no_shift = False)
w, h : image input shape.
function
function
function
function
generate_anchors_and_decode(bboxes, scores, threshold = 0.5, scale = 192)
Generate anchors and decode bounding boxes for mediapipe hand detection model.
class
depthai_nodes.node.parsers.utils.medipipe.HandRegion
method
variable
variable
variable
module
depthai_nodes.node.parsers.utils.mlsd
function
decode_scores_and_points(tpMap: np.ndarray, heat: np.ndarray, topk_n: int) -> Tuple[np.ndarray, np.ndarray, np.ndarray]: Tuple[np.ndarray, np.ndarray, np.ndarray]
Decode the scores and points from the neural network output tensors. Used for MLSD model. @param tpMap: Tensor containing the vector map. @type tpMap: np.ndarray @param heat: Tensor containing the heat map. @type heat: np.ndarray @param topk_n: Number of top candidates to keep. @type topk_n: int @return: Detected points, confidence scores for the detected points, and vector map. @rtype: Tuple[np.ndarray, np.ndarray, np.ndarray]
function
get_lines(pts: np.ndarray, pts_score: np.ndarray, vmap: np.ndarray, score_thr: float, dist_thr: float, input_size: int = 512) -> Tuple[np.ndarray, List[float]]: Tuple[np.ndarray, List[float]]
Get lines from the detected points and scores. The lines are filtered by the score threshold and distance threshold. Used for MLSD model. @param pts: Detected points. @type pts: np.ndarray @param pts_score: Confidence scores for the detected points. @type pts_score: np.ndarray @param vmap: Vector map. @type vmap: np.ndarray @param score_thr: Confidence score threshold for detected lines. @type score_thr: float @param dist_thr: Distance threshold for merging lines. @type dist_thr: float @param input_size: Input size of the model. @type input_size: int @return: Detected lines and their confidence scores. @rtype: Tuple[np.ndarray, List[float]]
module
depthai_nodes.node.parsers.utils.nms
function
nms(dets: np.ndarray, nms_thresh: float = 0.5) -> List[int]: List[int]
Non-maximum suppression. @param dets: Bounding boxes and confidence scores. @type dets: np.ndarray @param nms_thresh: Non-maximum suppression threshold. @type nms_thresh: float @return: Indices of the detections to keep. @rtype: List[int]
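For orientation, a compact NumPy sketch of the greedy IoU-based suppression described above (illustrative, not the library code); `dets` is assumed to hold `[x1, y1, x2, y2, score]` rows:

```python
import numpy as np

def nms_example(dets: np.ndarray, nms_thresh: float = 0.5) -> list:
    x1, y1, x2, y2, scores = dets.T
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]          # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # IoU of the best remaining box with all other candidates
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][iou <= nms_thresh]  # drop boxes that overlap too much
    return keep
```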
function
nms_cv2(bboxes: np.ndarray, scores: np.ndarray, conf_threshold: float, iou_threshold: float, max_det: int)
Non-maximum suppression from the opencv-python library. @param bboxes: A numpy array of shape (N, 4) containing the bounding boxes. @type bboxes: np.ndarray @param scores: A numpy array of shape (N,) containing the scores. @type scores: np.ndarray @param conf_threshold: Confidence score threshold. @type conf_threshold: float @param iou_threshold: Non-maximum suppression (IoU) threshold. @type iou_threshold: float @param max_det: Maximum number of detections to keep. @type max_det: int @return: Indices of the detections to keep. @rtype: List[int]
module
depthai_nodes.node.parsers.utils.ppdet
function
parse_paddle_detection_outputs(predictions: np.ndarray, mask_threshold: float = 0.25, bbox_threshold: float = 0.5, max_detections: int = 100, width: Optional[int] = None, height: Optional[int] = None) -> Tuple[np.ndarray, np.ndarray, np.ndarray]: Tuple[np.ndarray, np.ndarray, np.ndarray]
Parse the output of a PaddlePaddle Text Detection model from a mask of text probabilities into rotated bounding boxes with additional corners saved as keypoints. @param predictions: The output of a PaddlePaddle Text Detection model. @type predictions: np.ndarray @param mask_threshold: The threshold for the mask. @type mask_threshold: float @param bbox_threshold: The threshold for bounding boxes. @type bbox_threshold: float @param max_detections: The maximum number of candidate bounding boxes. @type max_detections: int @return: A tuple containing the rotated bounding boxes, corners and scores. @rtype: Tuple[np.ndarray, np.ndarray, np.ndarray]
module
depthai_nodes.node.parsers.utils.scrfd
function
distance2bbox(points, distance, max_shape = None)
Decode distance prediction to bounding box. @param points: Shape (n, 2), [x, y]. @type points: np.ndarray @param distance: Distance from the given point to 4 boundaries (left, top, right, bottom). @type distance: np.ndarray @param max_shape: Shape of the image. @type max_shape: Tuple[int, int] @return: Decoded bboxes. @rtype: np.ndarray
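The decoding is a simple offset from each anchor point; a NumPy sketch under the stated (left, top, right, bottom) convention (the optional clamping to max_shape is omitted here):

```python
import numpy as np

def distance2bbox_example(points: np.ndarray, distance: np.ndarray) -> np.ndarray:
    # points: (n, 2) anchor centres; distance: (n, 4) offsets to the four sides.
    x1 = points[:, 0] - distance[:, 0]
    y1 = points[:, 1] - distance[:, 1]
    x2 = points[:, 0] + distance[:, 2]
    y2 = points[:, 1] + distance[:, 3]
    return np.stack([x1, y1, x2, y2], axis=-1)

pts = np.array([[100.0, 100.0]])
dist = np.array([[10.0, 20.0, 30.0, 40.0]])
print(distance2bbox_example(pts, dist))  # [[ 90.  80. 130. 140.]]
```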
function
distance2kps(points, distance, max_shape = None)
Decode distance prediction to keypoints. @param points: Shape (n, 2), [x, y]. @type points: np.ndarray @param distance: Distance from the given point to 4 boundaries (left, top, right, bottom). @type distance: np.ndarray @param max_shape: Shape of the image. @type max_shape: Tuple[int, int] @return: Decoded keypoints. @rtype: np.ndarray
function
decode_scrfd(bboxes_concatenated, scores_concatenated, kps_concatenated, feat_stride_fpn, input_size, num_anchors, score_threshold, nms_threshold)
Decode the detection results of SCRFD. @param bboxes_concatenated: List of bounding box predictions for each scale. @type bboxes_concatenated: list[np.ndarray] @param scores_concatenated: List of confidence score predictions for each scale. @type scores_concatenated: list[np.ndarray] @param kps_concatenated: List of keypoint predictions for each scale. @type kps_concatenated: list[np.ndarray] @param feat_stride_fpn: List of feature strides for each scale. @type feat_stride_fpn: list[int] @param input_size: Input size of the model. @type input_size: tuple[int] @param num_anchors: Number of anchors. @type num_anchors: int @param score_threshold: Confidence score threshold. @type score_threshold: float @param nms_threshold: Non-maximum suppression threshold. @type nms_threshold: float @return: Bounding boxes, confidence scores, and keypoints of detected objects. @rtype: tuple[np.ndarray, np.ndarray, np.ndarray]
module
depthai_nodes.node.parsers.utils.superanimal
function
get_top_values(heatmap)
Get the top values from the heatmap tensor. @param heatmap: Heatmap tensor. @type heatmap: np.ndarray @return: Y and X coordinates of the top values. @rtype: Tuple[np.ndarray, np.ndarray]
function
get_pose_prediction(heatmap, locref, scale_factors)
Get the pose prediction from the heatmap and locref tensors. Used for SuperAnimal model. @param heatmap: Heatmap tensor. @type heatmap: np.ndarray @param locref: Locref tensor. @type locref: np.ndarray @param scale_factors: Scale factors for the x and y axes. @type scale_factors: Tuple[float, float] @return: Pose prediction. @rtype: np.ndarray
module
depthai_nodes.node.parsers.utils.ufld
module
depthai_nodes.node.parsers.utils.xfeat
function
local_maximum_filter(x: np.ndarray, kernel_size: int) -> np.ndarray: np.ndarray
Apply a local maximum filter to the input array. @param x: Input array. @type x: np.ndarray @param kernel_size: Size of the local maximum filter. @type kernel_size: int @return: Output array after applying the local maximum filter. @rtype: np.ndarray
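One way to picture the operation: compare every value with the maximum of its kernel_size x kernel_size neighbourhood and keep only the local maxima. A 2-D NumPy sketch of that idea (the library version may operate on batched tensors and differ in details):

```python
import numpy as np

def local_maximum_filter_example(x: np.ndarray, kernel_size: int) -> np.ndarray:
    # Keep values that equal the maximum of their local window; zero the rest.
    pad = kernel_size // 2
    padded = np.pad(x, pad, mode="constant", constant_values=-np.inf)
    window_max = np.full_like(x, -np.inf)
    for dy in range(kernel_size):
        for dx in range(kernel_size):
            shifted = padded[dy:dy + x.shape[0], dx:dx + x.shape[1]]
            window_max = np.maximum(window_max, shifted)
    return np.where(x == window_max, x, 0.0)

heat = np.array([[0.1, 0.9, 0.2], [0.3, 0.4, 0.8], [0.05, 0.6, 0.7]])
print(local_maximum_filter_example(heat, kernel_size=3))
```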
function
normgrid(x, H, W)
Normalize coords to [-1,1]. @param x: Input coordinates, shape (N, Hg, Wg, 2) @type x: np.ndarray @param H: Height of the output feature map @type H: int @param W: Width of the output feature map @type W: int @return: Normalized coordinates, shape (N, Hg, Wg, 2) @rtype: np.ndarray
function
bilinear(im, pos, H, W)
Given an input and a flow-field grid, computes the output using input values and pixel locations from the grid. Only the bilinear interpolation method is supported for sampling the input pixels. @param im: Input feature map, shape (N, C, H, W) @type im: np.ndarray @param pos: Point coordinates, shape (N, Hg, Wg, 2) @type pos: np.ndarray @param H: Height of the output feature map @type H: int @param W: Width of the output feature map @type W: int @return: A tensor with sampled points, shape (N, C, Hg, Wg) @rtype: np.ndarray
function
detect_and_compute(feats: np.ndarray, kpts: np.ndarray, heatmaps: np.ndarray, resize_rate_w: float, resize_rate_h: float, input_size: Tuple[int, int], top_k: int = 4096) -> List[Dict[str, Any]]: List[Dict[str, Any]]
Detect and compute keypoints. @param feats: Features. @type feats: np.ndarray @param kpts: Keypoints. @type kpts: np.ndarray @param heatmaps: Heatmaps. @type heatmaps: np.ndarray @param resize_rate_w: Resize rate for width. @type resize_rate_w: float @param resize_rate_h: Resize rate for height. @type resize_rate_h: float @param input_size: Input size. @type input_size: Tuple[int, int] @param top_k: Maximum number of keypoints to keep. @type top_k: int @return: List of dictionaries containing keypoints, scores, and descriptors. @rtype: List[Dict[str, Any]]
function
match(result1: Dict[str, Any], result2: Dict[str, Any], min_cossim: float = -1) -> Tuple[np.ndarray, np.ndarray]: Tuple[np.ndarray, np.ndarray]
Match keypoints. @param result1: Result 1. @type result1: Dict[str, Any] @param result2: Result 2. @type result2: Dict[str, Any] @param min_cossim: Minimum cosine similarity. @type min_cossim: float @return: Matched keypoints. @rtype: Tuple[np.ndarray, np.ndarray]
module
depthai_nodes.node.parsers.utils.yolo
variable
class
function
make_grid_numpy(ny: int, nx: int, na: int) -> np.ndarray: np.ndarray
Create a grid of shape (1, na, ny, nx, 2) @param ny: Number of y coordinates. @type ny: int @param nx: Number of x coordinates. @type nx: int @param na: Number of anchors. @type na: int @return: Grid. @rtype: np.ndarray
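A sketch of building such a grid with np.meshgrid; only the documented output shape (1, na, ny, nx, 2) is taken from the description above, the implementation details are assumed:

```python
import numpy as np

def make_grid_example(ny: int, nx: int, na: int) -> np.ndarray:
    # Grid of (x, y) cell coordinates, tiled across the anchor dimension.
    yv, xv = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")
    grid = np.stack((xv, yv), axis=-1).reshape(1, 1, ny, nx, 2)
    return np.tile(grid, (1, na, 1, 1, 1))

print(make_grid_example(ny=2, nx=3, na=1).shape)  # (1, 1, 2, 3, 2)
```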
function
non_max_suppression(prediction: np.ndarray, conf_thres: float = 0.5, iou_thres: float = 0.45, classes: Optional[List] = None, num_classes: int = 1, agnostic: bool = False, multi_label: bool = False, max_det: int = 300, max_time_img: float = 0.05, max_nms: int = 30000, max_wh: int = 7680, kpts_mode: bool = False, det_mode: bool = False) -> List[np.ndarray]: List[np.ndarray]
Performs Non-Maximum Suppression (NMS) on inference results. @param prediction: Prediction from the model, shape = (batch_size, boxes, xy+wh+...) @type prediction: np.ndarray @param conf_thres: Confidence threshold. @type conf_thres: float @param iou_thres: Intersection over union threshold. @type iou_thres: float @param classes: For filtering by classes. @type classes: Optional[List] @param num_classes: Number of classes. @type num_classes: int @param agnostic: Runs NMS on all boxes together rather than per class if True. @type agnostic: bool @param multi_label: Multilabel classification. @type multi_label: bool @param max_det: Limiting detections. @type max_det: int @param max_time_img: Maximum time for processing an image. @type max_time_img: float @param max_nms: Maximum number of boxes. @type max_nms: int @param max_wh: Maximum width and height. @type max_wh: int @param kpts_mode: Keypoints mode. @type kpts_mode: bool @param det_mode: Detection only mode. If True, the output will only contain bbox detections. @type det_mode: bool @return: An array of detections. If det_mode is False, the detections may include kpts or segmentation outputs. @rtype: List[np.ndarray]
function
parse_yolo_outputs(outputs: List[np.ndarray], strides: List[int], num_outputs: int, anchors: Optional[np.ndarray] = None, kpts: Optional[List[np.ndarray]] = None, det_mode: bool = False, subtype: YOLOSubtype = YOLOSubtype.DEFAULT) -> np.ndarray: np.ndarray
Parse all outputs of a YOLO model (all channels). @param outputs: List of outputs of a YOLO model. @type outputs: List[np.ndarray] @param strides: List of strides. @type strides: List[int] @param num_outputs: Number of outputs of the model. @type num_outputs: int @param anchors: An optional array of anchors. @type anchors: Optional[np.ndarray] @param kpts: An optional list of keypoints for each output. @type kpts: Optional[List[np.ndarray]] @param det_mode: Detection only mode. @type det_mode: bool @param subtype: YOLO version. @type subtype: YOLOSubtype @return: Parsed output. @rtype: np.ndarray
function
parse_yolo_output(out: np.ndarray, stride: int, num_outputs: int, anchors: Optional[np.ndarray] = None, head_id: int = -1, kpts: Optional[np.ndarray] = None, det_mode: bool = False, subtype: YOLOSubtype = YOLOSubtype.DEFAULT) -> np.ndarray: np.ndarray
Parse a single channel output of a YOLO model. @param out: A single output of a YOLO model for the given channel. @type out: np.ndarray @param stride: Stride. @type stride: int @param num_outputs: Number of outputs of the model. @type num_outputs: int @param anchors: Anchors for the given head. @type anchors: Optional[np.ndarray] @param head_id: Head ID. @type head_id: int @param kpts: A single output of keypoints for the given channel. @type kpts: Optional[np.ndarray] @param det_mode: Detection only mode. @type det_mode: bool @param subtype: YOLO version. @type subtype: YOLOSubtype @return: Parsed output. @rtype: np.ndarray
function
parse_kpts(kpts: np.ndarray, n_keypoints: int, img_shape: Tuple[int, int]) -> List[Tuple[int, int, float]]: List[Tuple[int, int, float]]
Parse keypoints. @param kpts: Result keypoints. @type kpts: np.ndarray @param n_keypoints: Number of keypoints. @type n_keypoints: int @param img_shape: Image shape of the model input in (height, width) format. @type img_shape: Tuple[int, int] @return: Parsed keypoints. @rtype: List[Tuple[int, int, float]]
function
decode_yolo_output(yolo_outputs: List[np.ndarray], strides: List[int], anchors: Optional[np.ndarray] = None, kpts: Optional[List[np.ndarray]] = None, conf_thres: float = 0.5, iou_thres: float = 0.45, num_classes: int = 1, det_mode: bool = False, subtype: YOLOSubtype = YOLOSubtype.DEFAULT) -> np.ndarray: np.ndarray
Decode the output of a YOLO instance segmentation or pose estimation model. @param yolo_outputs: List of YOLO outputs. @type yolo_outputs: List[np.ndarray] @param strides: List of strides. @type strides: List[int] @param anchors: An optional array of anchors. @type anchors: Optional[np.ndarray] @param kpts: An optional list of keypoints. @type kpts: Optional[List[np.ndarray]] @param conf_thres: Confidence threshold. @type conf_thres: float @param iou_thres: Intersection over union threshold. @type iou_thres: float @param num_classes: Number of classes. @type num_classes: int @param det_mode: Detection only mode. If True, the output will only contain bbox detections. @type det_mode: bool @param subtype: YOLO version. @type subtype: YOLOSubtype @return: NMS output. @rtype: np.ndarray
class
depthai_nodes.node.parsers.utils.yolo.YOLOSubtype(str, enum.Enum)
module
depthai_nodes.node.parsers.utils.yunet
function
manual_product(args)
You can use this function instead of itertools.product.
function
generate_anchors(input_size: Tuple[int, int], min_sizes: Optional[List[List[int]]] = None, strides: Optional[List[int]] = None)
Generate a set of default bounding boxes, known as anchors. The code is taken from https://github.com/Kazuhito00/YuNet-ONNX-TFLite-Sample/tree/main @param input_size: A tuple representing the width and height of the input image. @type input_size: Tuple[int, int] @param min_sizes: A list of lists, where each inner list contains the minimum sizes of the anchors for different feature maps. If None then '[[10, 16, 24], [32, 48], [64, 96], [128, 192, 256]]' will be used. Defaults to None. @type min_sizes: Optional[List[List[int]]] @param strides: Strides for each feature map layer. If None then '[8, 16, 32, 64]' will be used. Defaults to None. @type strides: Optional[List[int]] @return: Anchors. @rtype: np.ndarray
function
decode_detections(input_size: Tuple[int, int], loc: np.ndarray, conf: np.ndarray, iou: np.ndarray, variance: Optional[List[float]] = None)
Decodes the output of an object detection model by converting the model's predictions (localization, confidence, and IoU scores) into bounding boxes, keypoints, and scores. The code is taken from https://github.com/Kazuhito00/YuNet-ONNX-TFLite-Sample/tree/main @param input_size: The size of the input image (height, width). @type input_size: tuple @param loc: The predicted locations (or offsets) of the bounding boxes. @type loc: np.ndarray @param conf: The predicted class confidence scores. @type conf: np.ndarray @param iou: The predicted IoU (Intersection over Union) scores. @type iou: np.ndarray @param variance: A list of variances used to decode the bounding box predictions. If None then [0.1,0.2] will be used. Defaults to None. @type variance: Optional[List[float]] @return: A tuple of bboxes, keypoints, and scores. - bboxes: NumPy array of shape (N, 4) containing the decoded bounding boxes in the format [x_min, y_min, width, height]. - keypoints: A NumPy array of shape (N, 10) containing the decoded keypoint coordinates for each anchor. - scores: A NumPy array of shape (N, 1) containing the combined scores for each anchor. @rtype: Tuple[np.ndarray, np.ndarray, np.ndarray]
function
prune_detections(bboxes: np.ndarray, keypoints: np.ndarray, scores: np.ndarray, conf_threshold: float)
Prune detections based on confidence threshold. Parameters: @param bboxes: A numpy array of shape (N, 4) containing the bounding boxes. @type np.ndarray @param keypoints: A numpy array of shape (N, 10) containing the keypoints. @type np.ndarray @param scores: A numpy array of shape (N,) containing the scores. @type np.ndarray @param conf_threshold: The confidence threshold. @type float @return: A tuple of bboxes, keypoints, and scores. - bboxes: NumPy array of shape (N, 4) containing the decoded bounding boxes in the format [x_min, y_min, width, height]. - keypoints: A NumPy array of shape (N, 10) containing the decoded keypoint coordinates for each anchor. - scores: A NumPy array of shape (N, 1) containing the combined scores for each anchor. @rtype: Tuple[np.ndarray, np.ndarray, np.ndarray]
function
format_detections(bboxes: np.ndarray, keypoints: np.ndarray, scores: np.ndarray, input_size: Tuple[int, int])
Format detections into a list of dictionaries. @param bboxes: A numpy array of shape (N, 4) containing the bounding boxes. @type np.ndarray @param keypoints: A numpy array of shape (N, 10) containing the keypoints. @type np.ndarray @param scores: A numpy array of shape (N,) containing the scores. @type np.ndarray @param input_size: A tuple representing the width and height of the input image. @type input_size: tuple @return: A tuple of bboxes, keypoints, and scores. - bboxes: NumPy array of shape (N, 4) containing the decoded bounding boxes in the format [x_min, y_min, width, height]. - keypoints: A NumPy array of shape (N, 10) containing the decoded keypoint coordinates for each anchor. - scores: A NumPy array of shape (N, 1) containing the combined scores for each anchor. @rtype: Tuple[np.ndarray, np.ndarray, np.ndarray]
module
depthai_nodes.node.parsers.xfeat
class
XFeatBaseParser
Base parser class for parsing the output of the XFeat model. It is the parent class of the XFeatMonoParser and XFeatStereoParser classes. Attributes ---------- reference_input : Node.Input Reference input for stereo mode. It is a linking point to which the Neural Network's output is linked. It accepts the output of the Neural Network node. target_input : Node.Input Target input for stereo mode. It is a linking point to which the Neural Network's output is linked. It accepts the output of the Neural Network node. output_layer_feats : str Name of the output layer containing features. output_layer_keypoints : str Name of the output layer containing keypoints. output_layer_heatmaps : str Name of the output layer containing heatmaps. original_size : Tuple[float, float] Original image size. input_size : Tuple[float, float] Input image size. max_keypoints : int Maximum number of keypoints to keep. Error Handling -------------- **ValueError**: If the number of output layers is not E{3}. **ValueError**: If the original image size is not specified. **ValueError**: If the input image size is not specified. **ValueError**: If the maximum number of keypoints is not specified. **ValueError**: If the output layer containing features is not specified. **ValueError**: If the output layer containing keypoints is not specified. **ValueError**: If the output layer containing heatmaps is not specified.
class
depthai_nodes.node.parsers.xfeat.XFeatBaseParser(depthai_nodes.node.parsers.base_parser.BaseParser)
method
__init__(self, output_layer_feats: str = '', output_layer_keypoints: str = '', output_layer_heatmaps: str = '', original_size: Tuple[float, float] = None, input_size: Tuple[float, float] = (640, 352), max_keypoints: int = 4096)
Initializes the parser node.
variable
variable
variable
variable
variable
variable
property
reference_input
Returns the reference input.
property
target_input
Returns the target input.
method
reference_input.setter(self, reference_input: Optional[dai.Node.Input])
Sets the reference input.
variable
method
target_input.setter(self, target_input: Optional[dai.Node.Input])
Sets the target input.
method
setOutputLayerFeats(self, output_layer_feats: str)
Sets the output layer containing features. @param output_layer_feats: Name of the output layer containing features. @type output_layer_feats: str
method
setOutputLayerKeypoints(self, output_layer_keypoints: str)
Sets the output layer containing keypoints. @param output_layer_keypoints: Name of the output layer containing keypoints. @type output_layer_keypoints: str
method
setOutputLayerHeatmaps(self, output_layer_heatmaps: str)
Sets the output layer containing heatmaps. @param output_layer_heatmaps: Name of the output layer containing heatmaps. @type output_layer_heatmaps: str
method
setOriginalSize(self, original_size: Tuple[int, int])
Sets the original image size. @param original_size: Original image size. @type original_size: Tuple[int, int]
method
setInputSize(self, input_size: Tuple[int, int])
Sets the input image size. @param input_size: Input image size. @type input_size: Tuple[int, int]
method
setMaxKeypoints(self, max_keypoints: int)
Sets the maximum number of keypoints to keep. @param max_keypoints: Maximum number of keypoints. @type max_keypoints: int
method
build(self, head_config: Dict[str, Any]) -> XFeatBaseParser: XFeatBaseParser
Configures the parser. @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: XFeatBaseParser
method
validateParams(self)
Validates the parameters.
method
extractTensors(self, output: dai.NNData) -> Tuple[np.ndarray, np.ndarray, np.ndarray]: Tuple[np.ndarray, np.ndarray, np.ndarray]
Extracts the tensors from the output. It returns the features, keypoints, and heatmaps. It also handles the reshaping of the tensors by requesting the NCHW storage order. @param output: Output from the Neural Network node. @type output: dai.NNData @return: Tuple of features, keypoints, and heatmaps. @rtype: Tuple[np.ndarray, np.ndarray, np.ndarray]
module
depthai_nodes.node.parsing_neural_network
type variable
package
depthai_nodes.node.utils
function
copy_message(msg: dai.Buffer) -> dai.Buffer: dai.Buffer
Copies the incoming message and returns it. @param msg: The input message. @type msg: dai.Buffer @return: The copied message. @rtype: dai.Buffer
module
module
function
to_planar(arr: np.ndarray, shape: Tuple) -> np.ndarray: np.ndarray
Converts the input image `arr` (NumPy array) to the planar format expected by depthai. The image is resized to the dimensions specified in `shape`. @param arr: Input NumPy array (image). @type arr: np.ndarray @param shape: Target dimensions (width, height). @type shape: tuple @return: A 1D NumPy array with the planar image data. @rtype: np.ndarray
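A common way to produce the planar (CHW) layout is a resize followed by a transpose and flatten. The sketch below uses OpenCV for the resize as an assumption about the approach, not as the library's actual implementation:

```python
import cv2
import numpy as np

def to_planar_example(arr: np.ndarray, shape: tuple) -> np.ndarray:
    # Resize to (width, height), move channels first (HWC -> CHW), then flatten.
    resized = cv2.resize(arr, shape)  # cv2.resize expects (width, height)
    return resized.transpose(2, 0, 1).flatten()

frame = np.zeros((480, 640, 3), dtype=np.uint8)
planar = to_planar_example(frame, (300, 300))
print(planar.shape)  # (270000,) == 3 * 300 * 300
```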
function
generate_script_content(platform: str, resize_width: int, resize_height: int, padding: float = 0, valid_labels: Optional[List[int]] = None) -> str: str
The function generates the script content for the dai.Script node. It is used to crop and resize the input image based on the detected object. It can also work with padding around the detection bounding box and filter detections by labels. @param platform: Target platform for the script. Supported values: 'rvc2', 'rvc4' @type platform: str @param resize_width: Target width for the resized image @type resize_width: int @param resize_height: Target height for the resized image @type resize_height: int @param padding: Additional padding around the detection in normalized coordinates (0-1) @type padding: float @param valid_labels: List of valid label indices to filter detections. If None, all detections are processed @type valid_labels: Optional[List[int]] @return: Generated script content as a string @rtype: str
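A hedged usage sketch: the generated string is intended to be fed to a dai.Script node. The import path, node wiring, and the padding/valid_labels values below are assumptions for illustration, not taken from this reference:

```python
import depthai as dai
from depthai_nodes.node.utils import generate_script_content  # assumed import path

pipeline = dai.Pipeline()
script = pipeline.create(dai.node.Script)

# Crop-and-resize script for a hypothetical 224x224 second-stage network.
content = generate_script_content(
    platform="rvc2",
    resize_width=224,
    resize_height=224,
    padding=0.1,        # assumption: 10 % padding around each detection
    valid_labels=[0],   # assumption: only label 0 is of interest
)
script.setScript(content)
```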
function
nms_detections(detections: List[dai.ImgDetection], conf_thresh = 0.3, iou_thresh = 0.4)
Applies Non-Maximum Suppression (NMS) on a list of dai.ImgDetection objects. @param detections: List of dai.ImgDetection objects. @type detections: list[dai.ImgDetection] @param conf_thresh: Confidence threshold for filtering boxes. @type conf_thresh: float @param iou_thresh: IoU threshold for Non-Maximum Suppression (NMS). @type iou_thresh: float @return: A list of dai.ImgDetection objects after applying NMS. @rtype: list[dai.ImgDetection]
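A small illustrative example of filtering overlapping dai.ImgDetection objects. The import path, the helper, and the detection values are assumptions made up for the sketch; only the field names of dai.ImgDetection follow the standard DepthAI message:

```python
import depthai as dai
from depthai_nodes.node.utils import nms_detections  # assumed import path

def make_det(xmin, ymin, xmax, ymax, conf, label=0) -> dai.ImgDetection:
    # Hypothetical helper that fills a dai.ImgDetection with normalized coordinates.
    det = dai.ImgDetection()
    det.xmin, det.ymin, det.xmax, det.ymax = xmin, ymin, xmax, ymax
    det.confidence = conf
    det.label = label
    return det

dets = [
    make_det(0.10, 0.10, 0.50, 0.50, 0.90),  # strong detection
    make_det(0.12, 0.11, 0.52, 0.49, 0.40),  # heavily overlapping duplicate
    make_det(0.60, 0.60, 0.90, 0.90, 0.80),  # separate object
]
kept = nms_detections(dets, conf_thresh=0.3, iou_thresh=0.4)
print(len(kept))  # expected: 2 (the overlapping duplicate is suppressed)
```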
module
depthai_nodes.node.utils.nms
function
nms(boxes, scores, iou_thresh)
Perform Non-Maximum Suppression (NMS). @param boxes: An ndarray of shape (N, 4), where each row is [xmin, ymin, xmax, ymax]. @type boxes: np.ndarray @param scores: An ndarray of shape (N,), containing the confidence scores for each box. @type scores: np.ndarray @param iou_thresh: The IoU threshold for Non-Maximum Suppression (NMS). @type iou_thresh: float @return: A list of indices of the boxes to keep after applying NMS. @rtype: list[int]
class
depthai_nodes.node.ApplyColormap(depthai.node.HostNode)
method
method
setColormap(self, colormap_value: int)
Sets the applied color mapping. @param colormap_value: OpenCV colormap enum value (e.g. cv2.COLORMAP_HOT) @type colormap_value: int
method
setMaxValue(self, max_value: int)
Sets the maximum frame value for normalization. @param max_value: Maximum frame value. @type max_value: int
method
setInstanceToSemanticMask(self, instance_to_semantic_mask: bool)
Sets the instance to semantic mask flag. @param instance_to_semantic_mask: If True, converts instance segmentation masks to semantic segmentation masks. @type instance_to_semantic_mask: bool
method
build(self, arr: dai.Node.Output) -> ApplyColormap: ApplyColormap
Configures the node connections. @param arr: Output with 2D array. @type arr: depthai.Node.Output @return: The node object with input stream connected @rtype: ApplyColormap
method
process(self, msg: dai.Buffer)
Processes incoming 2D arrays and converts them to colored frames. @param msg: The input message with a 2D array. @type msg: dai.ImgFrame or Map2D or ImgDetectionsExtended
class
depthai_nodes.node.DepthMerger(depthai.node.HostNode)
method
variable
variable
method
variable
method
class
depthai_nodes.node.HostSpatialsCalc
method
variable
variable
variable
variable
variable
method
setLowerThreshold(self, threshold_low: int)
Sets the lower threshold for depth values. @param threshold_low: The lower threshold for depth values. @type threshold_low: int
method
setUpperThreshold(self, threshold_high: int)
Sets the upper threshold for depth values. @param threshold_high: The upper threshold for depth values. @type threshold_high: int
method
setDeltaRoi(self, delta: int)
Sets the delta value for ROI calculation. @param delta: The delta value for ROI calculation. @type delta: int
method
calc_spatials(self, depthData: dai.ImgFrame, roi: List[int], averaging_method: Callable = np.mean) -> Dict[str, float]: Dict[str, float]
Calculates spatial coordinates from depth data within the specified ROI. @param depthData: The depth data. @type depthData: dai.ImgFrame @param roi: The region of interest (ROI) or point. @type roi: List[int] @param averaging_method: The method for averaging the depth values. @type averaging_method: callable @return: The spatial coordinates. @rtype: Dict[str, float]
class
depthai_nodes.node.ImgDetectionsBridge(depthai.node.HostNode)
method
method
setIgnoreAngle(self, ignore_angle: bool) -> bool: bool
Sets whether to ignore the angle of the detections during transformation. @param ignore_angle: Whether to ignore the angle of the detections. @type ignore_angle: bool
method
build(self, msg: dai.Node.Output, ignore_angle: bool = False) -> ImgDetectionsBridge: ImgDetectionsBridge
Configures the node connections. @param msg: The input message for the ImgDetections object. @type msg: dai.Node.Output @param ignore_angle: Whether to ignore the angle of the detections. @type ignore_angle: bool @return: The node object with the transformed ImgDetections object. @rtype: ImgDetectionsBridge
variable
method
process(self, msg: dai.Buffer)
Transforms the incoming ImgDetections object. @param msg: The input message for the ImgDetections object. @type msg: dai.ImgDetections or ImgDetectionsExtended
class
depthai_nodes.node.ImgDetectionsFilter(depthai.node.HostNode)
method
method
setLabels(self, labels: List[int], keep: bool)
Sets the labels to keep or reject. @param labels: The labels to keep or reject. @type labels: List[int] @param keep: Whether to keep or reject the labels. @type keep: bool
method
setConfidenceThreshold(self, confidence_threshold: float)
Sets the confidence threshold. @param confidence_threshold: The confidence threshold. @type confidence_threshold: float
method
setMaxDetections(self, max_detections: int)
Sets the maximum number of detections. @param max_detections: The maximum number of detections. @type max_detections: int
method
setSortByConfidence(self, sort_by_confidence: bool)
Sets whether to sort the detections by confidence before subsetting. @param sort_by_confidence: Whether to sort the detections. @type sort_by_confidence: bool
method
method
class
depthai_nodes.node.ImgFrameOverlay(depthai.node.HostNode)
method
method
setAlpha(self, alpha: float)
Sets the alpha. @param alpha: The weight of the background frame in the overlay. @type alpha: float
method
build(self, frame1: dai.Node.Output, frame2: dai.Node.Output, alpha: float = None) -> ImgFrameOverlay: ImgFrameOverlay
Configures the node connections. @param frame1: The input message for the background frame. @type frame1: dai.Node.Output @param frame2: The input message for the foreground frame. @type frame2: dai.Node.Output @param alpha: The weight of the background frame in the overlay. @type alpha: float @return: The node object with the background and foreground streams overlaid. @rtype: ImgFrameOverlay
method
process(self, frame1: dai.Buffer, frame2: dai.Buffer)
Processes incoming frames and overlays them. @param frame1: The input message for the background frame. @type frame1: dai.ImgFrame @param frame2: The input message for the foreground frame. @type frame2: dai.ImgFrame
class
depthai_nodes.node.ParserGenerator(depthai.node.ThreadedHostNode)
method
method
build(self, nn_archive: dai.NNArchive, head_index: Optional[int] = None) -> Dict: Dict
Instantiates parsers based on the provided model archive. @param nn_archive: NN Archive of the model. @type nn_archive: dai.NNArchive @param head_index: Index of the head to be used for parsing. If not provided, each head will instantiate a separate parser. @type head_index: Optional[int] @return: A dictionary of instantiated parsers. @rtype: Dict[int, BaseParser]
method
class
depthai_nodes.node.BaseParser(depthai.node.ThreadedHostNode)
method
property
property
method
input.setter(self, node: dai.Node.Input)
Linking point to which the Neural Network's output is linked.
method
out.setter(self, node: dai.Node.Output)
Output node to which the processed network results are sent in the form of a DepthAI message.
method
build(self, head_config: Dict[str, Any]) -> BaseParser: BaseParser
Configures the parser based on the specified head configuration. @param head_config: A dictionary containing configuration details relevant to the parser, including parameters and settings required for output parsing. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: BaseParser
method
run(self)
Parses the output from the neural network head. This method should be overridden by subclasses to implement the specific parsing logic. It accepts arbitrary keyword arguments for flexibility. @param kwargs: Arbitrary keyword arguments for the parsing process. @type kwargs: Any @return message: The parsed output message, as defined by the logic in the subclass. @rtype message: Any
class
depthai_nodes.node.ClassificationParser(depthai_nodes.node.BaseParser)
method
__init__(self, output_layer_name: str = '', classes: List[str] = None, is_softmax: bool = True)
Initializes the parser node. @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str @param classes: List of class names to be used for linking with their respective scores. Expected to be in the same order as Neural Network's output. If not provided, the message will only return sorted scores. @type classes: List[str] @param is_softmax: If False, the scores are converted to probabilities using softmax function. @type is_softmax: bool
variable
variable
variable
variable
method
setOutputLayerName(self, output_layer_name: str)
Sets the name of the output layer. @param output_layer_name: The name of the output layer. @type output_layer_name: str
method
setClasses(self, classes: List[str])
Sets the class names for the classification model. @param classes: List of class names to be used for linking with their respective scores. @type classes: List[str]
method
setSoftmax(self, is_softmax: bool)
Sets the softmax flag for the classification model. @param is_softmax: If False, the parser will convert the scores to probabilities using softmax function. @type is_softmax: bool
method
build(self, head_config: Dict[str, Any]) -> ClassificationParser: ClassificationParser
Configures the parser. @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: ClassificationParser
method
class
depthai_nodes.node.ClassificationSequenceParser(depthai_nodes.node.ClassificationParser)
method
__init__(self, output_layer_name: str = '', classes: List[str] = None, is_softmax: bool = True, ignored_indexes: List[int] = None, remove_duplicates: bool = False, concatenate_classes: bool = False)
Initializes the parser node. @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str @param classes: List of available classes for the model. @type classes: List[str] @param ignored_indexes: List of indexes to ignore during classification generation (e.g., background class, blank space). @type ignored_indexes: List[int] @param is_softmax: If False, the scores are converted to probabilities using softmax function. @type is_softmax: bool @param remove_duplicates: If True, removes consecutive duplicates from the sequence. @type remove_duplicates: bool @param concatenate_classes: If True, concatenates consecutive words based on the predicted spaces. @type concatenate_classes: bool
variable
variable
variable
method
setRemoveDuplicates(self, remove_duplicates: bool)
Sets the remove_duplicates flag for the classification sequence model. @param remove_duplicates: If True, removes consecutive duplicates from the sequence. @type remove_duplicates: bool
method
setIgnoredIndexes(self, ignored_indexes: List[int])
Sets the ignored_indexes for the classification sequence model. @param ignored_indexes: A list of indexes to ignore during classification generation. @type ignored_indexes: List[int]
method
setConcatenateClasses(self, concatenate_classes: bool)
Sets the concatenate_classes flag for the classification sequence model. @param concatenate_classes: If True, concatenates consecutive classes into a single string. Used mostly for text processing. @type concatenate_classes: bool
method
build(self, head_config: Dict[str, Any]) -> ClassificationSequenceParser: ClassificationSequenceParser
Configures the parser. @param head_config: The head configuration for the parser. The required keys are `classes`, `n_classes`, and `is_softmax`. In addition to these, there are three optional keys that are mostly used for text processing: `ignored_indexes`, `remove_duplicates` and `concatenate_classes`. @type head_config: Dict[str, Any] @return: Returns the instantiated parser with the correct configuration. @rtype: ClassificationSequenceParser
method
variable
class
depthai_nodes.node.DetectionParser(depthai_nodes.node.BaseParser)
method
__init__(self, conf_threshold: float = 0.5, iou_threshold: float = 0.5, max_det: int = 100, label_names: Optional[List[str]] = None)
Initializes the parser node. @param conf_threshold: Confidence score threshold of detected bounding boxes. @type conf_threshold: float @param iou_threshold: Non-maximum suppression threshold. @type iou_threshold: float @param max_det: Maximum number of detections to keep. @type max_det: int @param label_names: List of label names for detected objects. @type label_names: Optional[List[str]]
variable
variable
variable
variable
method
setConfidenceThreshold(self, threshold: float)
Sets the confidence score threshold for detected objects. @param threshold: Confidence score threshold for detected objects. @type threshold: float
method
setIOUThreshold(self, threshold: float)
Sets the non-maximum suppression threshold. @param threshold: Non-maximum suppression threshold. @type threshold: float
method
setMaxDetections(self, max_det: int)
Sets the maximum number of detections to keep. @param max_det: Maximum number of detections to keep. @type max_det: int
method
setLabelNames(self, label_names: List[str])
Sets the label names for detected objects. @param label_names: List of label names for detected objects. @type label_names: List[str]
method
build(self, head_config) -> DetectionParser: DetectionParser
Configures the parser. @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: DetectionParser
method
class
depthai_nodes.node.EmbeddingsParser(depthai_nodes.node.BaseParser)
method
__init__(self)
Initialize the EmbeddingsParser node.
variable
method
setOutputLayerNames(self, output_layer_name: str)
Sets the output layer name for the parser. @param output_layer_name: The output layer name for the parser. @type output_layer_name: str
method
build(self, head_config: Dict[str, Any]) -> EmbeddingsParser: EmbeddingsParser
Sets the head configuration for the parser. @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: EmbeddingsParser
method
class
depthai_nodes.node.FastSAMParser(depthai_nodes.node.BaseParser)
method
__init__(self, conf_threshold: int = 0.5, n_classes: int = 1, iou_threshold: float = 0.5, mask_conf: float = 0.5, prompt: str = 'everything', points: Optional[Tuple[int, int]] = None, point_label: Optional[int] = None, bbox: Optional[Tuple[int, int, int, int]] = None, yolo_outputs: List[str] = None, mask_outputs: List[str] = None, protos_output: str = 'protos_output')
Initializes the parser node. @param conf_threshold: The confidence threshold for the detections @type conf_threshold: float @param n_classes: The number of classes in the model @type n_classes: int @param iou_threshold: The intersection over union threshold @type iou_threshold: float @param mask_conf: The mask confidence threshold @type mask_conf: float @param prompt: The prompt type @type prompt: str @param points: The points @type points: Optional[Tuple[int, int]] @param point_label: The point label @type point_label: Optional[int] @param bbox: The bounding box @type bbox: Optional[Tuple[int, int, int, int]] @param yolo_outputs: The YOLO outputs @type yolo_outputs: List[str] @param mask_outputs: The mask outputs @type mask_outputs: List[str] @param protos_output: The protos output @type protos_output: str
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
setConfidenceThreshold(self, threshold: float)
Sets the confidence score threshold. @param threshold: Confidence score threshold. @type threshold: float
method
setNumClasses(self, n_classes: int)
Sets the number of classes in the model. @param n_classes: The number of classes in the model. @type n_classes: int
method
setIouThreshold(self, iou_threshold: float)
Sets the intersection over union threshold. @param iou_threshold: The intersection over union threshold. @type iou_threshold: float
method
setMaskConfidence(self, mask_conf: float)
Sets the mask confidence threshold. @param mask_conf: The mask confidence threshold. @type mask_conf: float
method
setPrompt(self, prompt: str)
Sets the prompt type. @param prompt: The prompt type @type prompt: str
method
setPoints(self, points: Tuple[int, int])
Sets the points. @param points: The points @type points: Tuple[int, int]
method
setPointLabel(self, point_label: int)
Sets the point label. @param point_label: The point label @type point_label: int
method
setBoundingBox(self, bbox: Tuple[int, int, int, int])
Sets the bounding box. @param bbox: The bounding box @type bbox: Tuple[int, int, int, int]
method
setYoloOutputs(self, yolo_outputs: List[str])
Sets the YOLO outputs. @param yolo_outputs: The YOLO outputs @type yolo_outputs: List[str]
method
setMaskOutputs(self, mask_outputs: List[str])
Sets the mask outputs. @param mask_outputs: The mask outputs @type mask_outputs: List[str]
method
setProtosOutput(self, protos_output: str)
Sets the protos output. @param protos_output: The protos output @type protos_output: str
method
build(self, head_config: Dict[str, Any]) -> FastSAMParser: FastSAMParser
Configures the parser. @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: FastSAMParser
method
class
depthai_nodes.node.HRNetParser(depthai_nodes.node.KeypointParser)
method
__init__(self, output_layer_name: str = '', score_threshold: float = 0.5)
Initializes the parser node. @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str @param score_threshold: Confidence score threshold for detected keypoints. @type score_threshold: float
method
setOutputLayerName(self, output_layer_name: str)
Sets the name of the output layer. @param output_layer_name: The name of the output layer. @type output_layer_name: str
variable
method
build(self, head_config: Dict[str, Any]) -> HRNetParser: HRNetParser
Configures the parser. @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: HRNetParser
method
variable
class
depthai_nodes.node.ImageOutputParser(depthai_nodes.node.BaseParser)
method
__init__(self, output_layer_name: str = '', output_is_bgr: bool = False)
Initializes the parser node. @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str @param output_is_bgr: Flag indicating if the output image is in BGR. @type output_is_bgr: bool
variable
variable
method
setOutputLayerName(self, output_layer_name: str)
Sets the name of the output layer. @param output_layer_name: The name of the output layer. @type output_layer_name: str
method
setBGROutput(self)
Sets the flag indicating that output image is in BGR.
method
build(self, head_config: Dict[str, Any]) -> ImageOutputParser
Configures the parser. @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: ImageOutputParser
method
class
depthai_nodes.node.KeypointParser(depthai_nodes.node.BaseParser)
method
__init__(self, output_layer_name: str = '', scale_factor: float = 1.0, n_keypoints: int = None, score_threshold: float = None)
Initializes the parser node. @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str @param scale_factor: Scale factor to divide the keypoints by. @type scale_factor: float @param n_keypoints: Number of keypoints. @type n_keypoints: int @param score_threshold: Confidence score threshold for detected keypoints. @type score_threshold: float
variable
variable
variable
variable
method
setOutputLayerName(self, output_layer_name: str)
Sets the name of the output layer. @param output_layer_name: The name of the output layer. @type output_layer_name: str
method
setScaleFactor(self, scale_factor: float)
Sets the scale factor to divide the keypoints by. @param scale_factor: Scale factor to divide the keypoints by. @type scale_factor: float
method
setNumKeypoints(self, n_keypoints: int)
Sets the number of keypoints. @param n_keypoints: Number of keypoints. @type n_keypoints: int
method
setScoreThreshold(self, threshold: float)
Sets the confidence score threshold for detected keypoints. @param threshold: Confidence score threshold for detected keypoints. @type threshold: float
method
build(self, head_config: Dict[str, Any]) -> KeypointParser
Configures the parser. @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: KeypointParser
method
class
depthai_nodes.node.LaneDetectionParser(depthai_nodes.node.BaseParser)
method
__init__(self, output_layer_name: str = '', row_anchors: List[int] = None, griding_num: int = None, cls_num_per_lane: int = None, input_size: Tuple[int, int] = None)
Initializes the lane detection parser node. @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str @param row_anchors: List of row anchors. @type row_anchors: List[int] @param griding_num: Griding number. @type griding_num: int @param cls_num_per_lane: Number of points per lane. @type cls_num_per_lane: int @param input_size: Input size (width,height). @type input_size: Tuple[int, int]
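A minimal configuration sketch; the layer name, row anchors, and sizes below are placeholders for illustration and depend on the specific lane-detection model.
Python
import depthai as dai
from depthai_nodes.node import LaneDetectionParser

with dai.Pipeline() as pipeline:
    parser = pipeline.create(LaneDetectionParser)
    parser.setOutputLayerName("output")       # placeholder layer name
    parser.setRowAnchors([64, 76, 88, 100])   # placeholder row anchors
    parser.setGridingNum(100)                 # placeholder griding number
    parser.setClsNumPerLane(18)               # placeholder points per lane
    parser.setInputSize((800, 288))           # (width, height) of the model input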
variable
variable
variable
variable
variable
method
setOutputLayerName(self, output_layer_name: str)
Set the output layer name for the lane detection model. @param output_layer_name: Name of the output layer. @type output_layer_name: str
method
setRowAnchors(self, row_anchors: List[int])
Set the row anchors for the lane detection model. @param row_anchors: List of row anchors. @type row_anchors: List[int]
method
setGridingNum(self, griding_num: int)
Set the griding number for the lane detection model. @param griding_num: Griding number. @type griding_num: int
method
setClsNumPerLane(self, cls_num_per_lane: int)
Set the number of points per lane for the lane detection model. @param cls_num_per_lane: Number of classes per lane. @type cls_num_per_lane: int
method
setInputSize(self, input_size: Tuple[int, int])
Set the input size for the lane detection model. @param input_size: Input size (width,height). @type input_size: Tuple[int, int]
method
build(self, head_config: Dict[str, Any]) -> LaneDetectionParser
Configures the parser. @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: LaneDetectionParser
variable
variable
method
class
depthai_nodes.node.MapOutputParser(depthai_nodes.node.BaseParser)
method
__init__(self, output_layer_name: str = '', min_max_scaling: bool = False)
Initializes the parser node. @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str @param min_max_scaling: If True, the map is scaled to the range [0, 1]. @type min_max_scaling: bool
variable
variable
method
setOutputLayerName(self, output_layer_name: str)
Sets the name of the output layer. @param output_layer_name: The name of the output layer. @type output_layer_name: str
method
setMinMaxScaling(self, min_max_scaling: bool)
Sets the min_max_scaling flag. @param min_max_scaling: If True, the map is scaled to the range [0, 1]. @type min_max_scaling: bool
method
build(self, head_config: Dict[str, Any]) -> MapOutputParser
Configures the parser. @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: MapOutputParser
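For example, when the raw map values are not already normalized (e.g. a depth or density head), min-max scaling can be enabled so the resulting Map2D values lie in [0, 1]; the layer name below is a placeholder.
Python
import depthai as dai
from depthai_nodes.node import MapOutputParser

with dai.Pipeline() as pipeline:
    parser = pipeline.create(MapOutputParser)
    parser.setOutputLayerName("density_map")  # placeholder layer name
    parser.setMinMaxScaling(True)             # scale the output map to [0, 1]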
method
class
depthai_nodes.node.MPPalmDetectionParser(depthai_nodes.node.DetectionParser)
method
__init__(self, output_layer_names: List[str] = None, conf_threshold: float = 0.5, iou_threshold: float = 0.5, max_det: int = 100, scale: int = 192)
Initializes the parser node. @param output_layer_names: Names of the output layers relevant to the parser. @type output_layer_names: List[str] @param conf_threshold: Confidence score threshold for detected hands. @type conf_threshold: float @param iou_threshold: Non-maximum suppression threshold. @type iou_threshold: float @param max_det: Maximum number of detections to keep. @type max_det: int @param scale: Scale of the input image. @type scale: int
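A short illustrative tweak of the palm-detection post-processing; the scale mirrors the default listed above, and the layer names are placeholders.
Python
import depthai as dai
from depthai_nodes.node import MPPalmDetectionParser

with dai.Pipeline() as pipeline:
    parser = pipeline.create(MPPalmDetectionParser)
    parser.setOutputLayerNames(["box_scores", "box_coords"])  # placeholder layer names
    parser.setScale(192)  # scale of the model input image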
variable
variable
variable
variable
variable
variable
method
setOutputLayerNames(self, output_layer_names: List[str])
Sets the output layer name(s) for the parser. @param output_layer_names: The name of the output layer(s) from which the scores are extracted. @type output_layer_names: List[str]
method
setScale(self, scale: int)
Sets the scale of the input image. @param scale: Scale of the input image. @type scale: int
method
build(self, head_config: Dict[str, Any]) -> MPPalmDetectionParser
Configures the parser. @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: MPPalmDetectionParser
method
class
depthai_nodes.node.MLSDParser(depthai_nodes.node.BaseParser)
method
__init__(self, output_layer_tpmap: str = '', output_layer_heat: str = '', topk_n: int = 200, score_thr: float = 0.1, dist_thr: float = 20.0)
Initializes the parser node. @param output_layer_tpmap: Name of the output layer containing the tpMap tensor. @type output_layer_tpmap: str @param output_layer_heat: Name of the output layer containing the heat tensor. @type output_layer_heat: str @param topk_n: Number of top candidates to keep. @type topk_n: int @param score_thr: Confidence score threshold for detected lines. @type score_thr: float @param dist_thr: Distance threshold for merging lines. @type dist_thr: float
variable
variable
variable
variable
variable
method
setOutputLayerTPMap(self, output_layer_tpmap: str)
Sets the name of the output layer containing the tpMap tensor. @param output_layer_tpmap: Name of the output layer containing the tpMap tensor. @type output_layer_tpmap: str
method
setOutputLayerHeat(self, output_layer_heat: str)
Sets the name of the output layer containing the heat tensor. @param output_layer_heat: Name of the output layer containing the heat tensor. @type output_layer_heat: str
method
setTopK(self, topk_n: int)
Sets the number of top candidates to keep. @param topk_n: Number of top candidates to keep. @type topk_n: int
method
setScoreThreshold(self, score_thr: float)
Sets the confidence score threshold for detected lines. @param score_thr: Confidence score threshold for detected lines. @type score_thr: float
method
setDistanceThreshold(self, dist_thr: float)
Sets the distance threshold for merging lines. @param dist_thr: Distance threshold for merging lines. @type dist_thr: float
method
build(self, head_config: Dict[str, Any]) -> MLSDParser
Configures the parser. @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: MLSDParser
method
class
depthai_nodes.node.PPTextDetectionParser(depthai_nodes.node.DetectionParser)
method
__init__(self, output_layer_name: str = '', conf_threshold: float = 0.5, mask_threshold: float = 0.25, max_det: int = 100)
Initializes the parser node. @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str @param conf_threshold: The threshold for bounding boxes. @type conf_threshold: float @param mask_threshold: The threshold for the mask. @type mask_threshold: float @param max_det: The maximum number of candidate bounding boxes. @type max_det: int
variable
variable
method
setOutputLayerName(self, output_layer_name: str)
Sets the name of the output layer. @param output_layer_name: The name of the output layer. @type output_layer_name: str
method
setMaskThreshold(self, mask_threshold: float = 0.25)
Sets the mask threshold for creating the mask from model output probabilities. @param mask_threshold: The threshold for the mask. @type mask_threshold: float
method
build(self, head_config: Dict[str, Any]) -> PPTextDetectionParser
Configures the parser. @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: PPTextDetectionParser
method
class
depthai_nodes.node.RegressionParser(depthai_nodes.node.BaseParser)
method
__init__(self, output_layer_name: str = '')
Initializes the parser node. @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str
variable
method
setOutputLayerName(self, output_layer_name: str)
Sets the name of the output layer. @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str
method
build(self, head_config: Dict[str, Any]) -> RegressionParser
Configures the parser. @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: RegressionParser
method
class
depthai_nodes.node.SCRFDParser(depthai_nodes.node.DetectionParser)
method
__init__(self, output_layer_names: List[str] = None, conf_threshold: float = 0.5, iou_threshold: float = 0.5, max_det: int = 100, input_size: Tuple[int, int] = (640, 640), feat_stride_fpn: Tuple = (8, 16, 32), num_anchors: int = 2)
Initializes the parser node. @param output_layer_names: Names of the output layers relevant to the parser. @type output_layer_names: List[str] @param conf_threshold: Confidence score threshold for detected faces. @type conf_threshold: float @param iou_threshold: Non-maximum suppression threshold. @type iou_threshold: float @param max_det: Maximum number of detections to keep. @type max_det: int @param input_size: Input size of the model. @type input_size: tuple @param feat_stride_fpn: List of the feature strides. @type feat_stride_fpn: tuple @param num_anchors: Number of anchors. @type num_anchors: int
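A brief sketch of overriding the anchor and stride configuration; the values shown simply restate the constructor defaults above.
Python
import depthai as dai
from depthai_nodes.node import SCRFDParser

with dai.Pipeline() as pipeline:
    parser = pipeline.create(SCRFDParser)
    parser.setInputSize((640, 640))       # model input size
    parser.setFeatStrideFPN([8, 16, 32])  # feature strides of the FPN
    parser.setNumAnchors(2)               # anchors per location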
variable
variable
variable
variable
variable
method
setOutputLayerNames(self, output_layer_names: List[str])
Sets the output layer name(s) for the parser. @param output_layer_names: The name of the output layer(s) to be used. @type output_layer_names: List[str]
method
setInputSize(self, input_size: Tuple[int, int])
Sets the input size of the model. @param input_size: Input size of the model. @type input_size: Tuple[int, int]
method
setFeatStrideFPN(self, feat_stride_fpn: List[int])
Sets the feature stride of the FPN. @param feat_stride_fpn: Feature stride of the FPN. @type feat_stride_fpn: List[int]
method
setNumAnchors(self, num_anchors: int)
Sets the number of anchors. @param num_anchors: Number of anchors. @type num_anchors: int
method
build(self, head_config: Dict[str, Any]) -> SCRFDParser
Configures the parser. @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: SCRFDParser
method
class
depthai_nodes.node.SegmentationParser(depthai_nodes.node.BaseParser)
method
__init__(self, output_layer_name: str = '', classes_in_one_layer: bool = False)
Initializes the parser node. @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str @param classes_in_one_layer: Whether all classes are in one layer in the multi-class segmentation model. Default is False. If True, the parser will use np.max instead of np.argmax to get the class map. @type classes_in_one_layer: bool
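For instance, a multi-class model that packs all classes into a single output layer can be handled by flipping the flag below (the layer name is a placeholder).
Python
import depthai as dai
from depthai_nodes.node import SegmentationParser

with dai.Pipeline() as pipeline:
    parser = pipeline.create(SegmentationParser)
    parser.setOutputLayerName("segmentation")  # placeholder layer name
    parser.setClassesInOneLayer(True)          # use np.max instead of np.argmax for the class map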
variable
variable
method
setOutputLayerName(self, output_layer_name: str)
Sets the name of the output layer. @param output_layer_name: The name of the output layer. @type output_layer_name: str
method
setClassesInOneLayer(self, classes_in_one_layer: bool)
Sets the flag indicating whether all classes are in one layer. @param classes_in_one_layer: Whether all classes are in one layer. @type classes_in_one_layer: bool
method
build(self, head_config: Dict[str, Any]) -> SegmentationParser
Configures the parser. @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: SegmentationParser
method
class
depthai_nodes.node.SuperAnimalParser(depthai_nodes.node.KeypointParser)
method
__init__(self, output_layer_name: str = '', scale_factor: float = 256.0, n_keypoints: int = 39, score_threshold: float = 0.5)
Initializes the parser node. @param output_layer_name: Name of the output layer relevant to the parser. @type output_layer_name: str @param n_keypoints: Number of keypoints. @type n_keypoints: int @param score_threshold: Confidence score threshold for detected keypoints. @type score_threshold: float @param scale_factor: Scale factor to divide the keypoints by. @type scale_factor: float
method
build(self, head_config: Dict[str, Any]) -> SuperAnimalParser
Configures the parser. @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: SuperAnimalParser
method
variable
class
depthai_nodes.node.XFeatMonoParser(depthai_nodes.node.parsers.xfeat.XFeatBaseParser)
method
__init__(self, output_layer_feats: str = 'feats', output_layer_keypoints: str = 'keypoints', output_layer_heatmaps: str = 'heatmaps', original_size: Tuple[float, float] = None, input_size: Tuple[float, float] = (640, 352), max_keypoints: int = 4096)
Initializes the XFeatParser node. @param output_layer_feats: Name of the output layer containing features. @type output_layer_feats: str @param output_layer_keypoints: Name of the output layer containing keypoints. @type output_layer_keypoints: str @param output_layer_heatmaps: Name of the output layer containing heatmaps. @type output_layer_heatmaps: str @param original_size: Original image size. @type original_size: Tuple[float, float] @param input_size: Input image size. @type input_size: Tuple[float, float] @param max_keypoints: Maximum number of keypoints to keep. @type max_keypoints: int
variable
variable
method
setTrigger(self)
Sets the trigger to set the reference frame.
method
class
depthai_nodes.node.XFeatStereoParser(depthai_nodes.node.parsers.xfeat.XFeatBaseParser)
method
__init__(self, output_layer_feats: str = 'feats', output_layer_keypoints: str = 'keypoints', output_layer_heatmaps: str = 'heatmaps', original_size: Tuple[float, float] = None, input_size: Tuple[float, float] = (640, 352), max_keypoints: int = 4096)
Initializes the XFeatParser node. @param output_layer_feats: Name of the output layer containing features. @type output_layer_feats: str @param output_layer_keypoints: Name of the output layer containing keypoints. @type output_layer_keypoints: str @param output_layer_heatmaps: Name of the output layer containing heatmaps. @type output_layer_heatmaps: str @param original_size: Original image size. @type original_size: Tuple[float, float] @param input_size: Input image size. @type input_size: Tuple[float, float] @param max_keypoints: Maximum number of keypoints to keep. @type max_keypoints: int
method
class
depthai_nodes.node.YOLOExtendedParser(depthai_nodes.node.BaseParser)
method
__init__(self, conf_threshold: float = 0.5, n_classes: int = 1, label_names: Optional[List[str]] = None, iou_threshold: float = 0.5, mask_conf: float = 0.5, n_keypoints: int = 17, anchors: Optional[List[List[List[float]]]] = None, subtype: str = '')
Initialize the parser node. @param conf_threshold: The confidence threshold for the detections @type conf_threshold: float @param n_classes: The number of classes in the model @type n_classes: int @param label_names: The names of the classes @type label_names: Optional[List[str]] @param iou_threshold: The intersection over union threshold @type iou_threshold: float @param mask_conf: The mask confidence threshold @type mask_conf: float @param n_keypoints: The number of keypoints in the model @type n_keypoints: int @param anchors: The anchors for the YOLO model @type anchors: Optional[List[List[List[float]]]] @param subtype: The version of the YOLO model @type subtype: str
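An illustrative setup for a pose-style YOLO head; the class count, keypoint count, and subtype string are assumptions for the sketch, not values taken from a specific model.
Python
import depthai as dai
from depthai_nodes.node import YOLOExtendedParser

with dai.Pipeline() as pipeline:
    parser = pipeline.create(YOLOExtendedParser)
    parser.setConfidenceThreshold(0.5)
    parser.setIouThreshold(0.5)
    parser.setNumClasses(1)         # e.g. a single "person" class (assumption)
    parser.setNumKeypoints(17)      # e.g. COCO-style body keypoints (assumption)
    parser.setSubtype("yolov8")     # assumed subtype string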
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
setOutputLayerNames(self, output_layer_names: List[str])
Sets the output layer names for the parser. @param output_layer_names: The output layer names for the parser. @type output_layer_names: List[str]
method
setConfidenceThreshold(self, threshold: float)
Sets the confidence score threshold for detections. @param threshold: Confidence score threshold for detections. @type threshold: float
method
setNumClasses(self, n_classes: int)
Sets the number of classes in the model. @param n_classes: The number of classes in the model. @type n_classes: int
method
setIouThreshold(self, iou_threshold: float)
Sets the intersection over union threshold. @param iou_threshold: The intersection over union threshold. @type iou_threshold: float
method
setMaskConfidence(self, mask_conf: float)
Sets the mask confidence threshold. @param mask_conf: The mask confidence threshold. @type mask_conf: float
method
setNumKeypoints(self, n_keypoints: int)
Sets the number of keypoints in the model. @param n_keypoints: The number of keypoints in the model. @type n_keypoints: int
method
setAnchors(self, anchors: List[List[List[float]]])
Sets the anchors for the YOLO model. @param anchors: The anchors for the YOLO model. @type anchors: List[List[List[float]]]
method
setSubtype(self, subtype: str)
Sets the subtype of the YOLO model. @param subtype: The subtype of the YOLO model. @type subtype: str
method
setLabelNames(self, label_names: List[str])
Sets the names of the classes. @param label_names: The names of the classes. @type label_names: List[str]
method
build(self, head_config: Dict[str, Any])
Configures the parser. @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: YOLOExtendedParser
method
class
depthai_nodes.node.YuNetParser(depthai_nodes.node.DetectionParser)
method
__init__(self, conf_threshold: float = 0.8, iou_threshold: float = 0.3, max_det: int = 5000, input_size: Tuple[int, int] = None, loc_output_layer_name: str = None, conf_output_layer_name: str = None, iou_output_layer_name: str = None)
Initializes the parser node. @param conf_threshold: Confidence score threshold for detected faces. @type conf_threshold: float @param iou_threshold: Non-maximum suppression threshold. @type iou_threshold: float @param max_det: Maximum number of detections to keep. @type max_det: int @param input_size: Input size of the model (width, height). @type input_size: Tuple[int, int] @param loc_output_layer_name: Output layer name for the location predictions. @type loc_output_layer_name: str @param conf_output_layer_name: Output layer name for the confidence predictions. @type conf_output_layer_name: str @param iou_output_layer_name: Output layer name for the IoU predictions. @type iou_output_layer_name: str
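An illustrative override of the YuNet input size and output layer names (all names below are placeholders).
Python
import depthai as dai
from depthai_nodes.node import YuNetParser

with dai.Pipeline() as pipeline:
    parser = pipeline.create(YuNetParser)
    parser.setInputSize((320, 240))      # (width, height), placeholder
    parser.setOutputLayerLoc("loc")      # placeholder location-tensor layer name
    parser.setOutputLayerConf("conf")    # placeholder confidence-tensor layer name
    parser.setOutputLayerIou("iou")      # placeholder IoU-tensor layer name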
variable
variable
variable
variable
variable
method
setInputSize(self, input_size: Tuple[int, int])
Sets the input size of the model. @param input_size: Input size of the model (width, height). @type input_size: Tuple[int, int]
method
setOutputLayerLoc(self, loc_output_layer_name: str)
Sets the name of the output layer containing the location predictions. @param loc_output_layer_name: Output layer name for the loc tensor. @type loc_output_layer_name: str
method
setOutputLayerConf(self, conf_output_layer_name: str)
Sets the name of the output layer containing the confidence predictions. @param conf_output_layer_name: Output layer name for the conf tensor. @type conf_output_layer_name: str
method
setOutputLayerIou(self, iou_output_layer_name: str)
Sets the name of the output layer containing the IoU predictions. @param iou_output_layer_name: Output layer name for the IoU tensor. @type iou_output_layer_name: str
method
build(self, head_config: Dict[str, Any]) -> YuNetParser
Configures the parser. @param head_config: The head configuration for the parser. @type head_config: Dict[str, Any] @return: The parser object with the head configuration set. @rtype: YuNetParser
variable
variable
method
class
depthai_nodes.node.ParsingNeuralNetwork(depthai.node.ThreadedHostNode)
method
__init__(self, *args, **kwargs)
Initializes the ParsingNeuralNetwork node. NeuralNetwork node is created in the pipeline. @param args: Arguments to be passed to the ThreadedHostNode class. @param kwargs: Keyword arguments to be passed to the ThreadedHostNode class.
property
property
property
property
property
method
getNumInferenceThreads(self) -> int
Returns number of inference threads of the NeuralNetwork node.
method
method
getOutput(self, head: int) -> dai.Node.Output
Obtains output of a parser for specified NeuralNetwork model head.
method
setBackend(self, setBackend: str)
Sets the backend of the NeuralNetwork node.
method
setBackendProperties(self, setBackendProperties: Dict[str, str])
Sets the backend properties of the NeuralNetwork node.
method
setBlob(self, blob: Union[Path, dai.OpenVINO.Blob])
Sets the blob of the NeuralNetwork node.
method
setBlobPath(self, path: Path)
Sets the blob path of the NeuralNetwork node.
method
setFromModelZoo(self, description: dai.NNModelDescription, useCached: bool)
Sets the model from the model zoo of the NeuralNetwork node.
method
setModelPath(self, modelPath: Path)
Sets the model path of the NeuralNetwork node.
method
setNNArchive(self, nnArchive: dai.NNArchive, numShaves: Optional[int] = None)
Sets the NNArchive of the ParsingNeuralNetwork node. Updates the NeuralNetwork node and parser nodes. @param nnArchive: Neural network archive containing the model and its configuration. @type nnArchive: dai.NNArchive @param numShaves: Optional number of shaves to allocate for the neural network. If not provided, uses default allocation. @type numShaves: Optional[int]
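A hedged sketch of supplying (or swapping) the model via an archive; the archive paths are placeholders.
Python
import depthai as dai
from depthai_nodes.node import ParsingNeuralNetwork

with dai.Pipeline() as pipeline:
    camera = pipeline.create(dai.node.Camera).build()
    nn = pipeline.create(ParsingNeuralNetwork).build(camera, dai.NNArchive("model.tar.xz"))  # placeholder path
    # Reconfigure the NeuralNetwork node and its parsers from a different archive,
    # optionally pinning the shave allocation.
    nn.setNNArchive(dai.NNArchive("other_model.tar.xz"), numShaves=4)  # placeholder path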
method
setNumInferenceThreads(self, numThreads: int)
Sets the number of inference threads of the NeuralNetwork node.
method
setNumNCEPerInferenceThread(self, numNCEPerThread: int)
Sets the number of NCE per inference thread of the NeuralNetwork node.
method
setNumPoolFrames(self, numFrames: int)
Sets the number of pool frames of the NeuralNetwork node.
method
setNumShavesPerInferenceThread(self, numShavesPerInferenceThread: int)
Sets the number of shaves per inference thread of the NeuralNetwork node.
method
build(self, input: Union[dai.Node.Output, dai.node.Camera], nn_source: Union[dai.NNModelDescription, dai.NNArchive, str], fps: Optional[float] = None) -> ParsingNeuralNetwork
Builds the underlying NeuralNetwork node and creates parser nodes for each model head. @param input: Node's input. It is the linking point to which the NeuralNetwork is linked; it accepts either a node output or a Camera node directly. @type input: Union[dai.Node.Output, dai.node.Camera] @param nn_source: NNModelDescription object containing the HubAI model descriptors, NNArchive object of the model, or HubAI model slug in form of <model_slug>:<model_version_slug> or <model_slug>:<model_version_slug>:<model_instance_hash>. @type nn_source: Union[dai.NNModelDescription, dai.NNArchive, str] @param fps: FPS limit for the model runtime. @type fps: float @return: Returns the ParsingNeuralNetwork object. @rtype: ParsingNeuralNetwork @raise ValueError: If the nn_source is not a NNModelDescription or NNArchive object.
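A minimal usage sketch, assuming a single-head model; the model slug is a placeholder, and for multi-head archives the per-head parser outputs are obtained with getOutput(i) instead of out.
Python
import depthai as dai
from depthai_nodes.node import ParsingNeuralNetwork

with dai.Pipeline() as pipeline:
    camera = pipeline.create(dai.node.Camera).build()
    nn = pipeline.create(ParsingNeuralNetwork).build(
        camera, "<model_slug>:<model_version_slug>"  # placeholder HubAI slug
    )
    parser_queue = nn.out.createOutputQueue()  # `out` assumes a single-head model
    pipeline.start()
    while pipeline.isRunning():
        parsed_message = parser_queue.get()  # message type depends on the parser head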
method
run(self)
Method inherited from ThreadedHostNode. It runs when the pipeline is started.
class
depthai_nodes.node.TilesPatcher(depthai.node.HostNode)
method
__init__(self)
Initializes the TilesPatcher node, sets default thresholds for confidence and IOU, and initializes buffers for tile processing.
variable
variable
variable
variable
variable
variable
variable
method
build(self, tile_manager: Tiling, nn: dai.Node.Output, conf_thresh=0.3, iou_thresh=0.4) -> TilesPatcher
Configures the TilesPatcher node with the tile manager and links the neural network's output. @param tile_manager: The tiling manager responsible for tile positions and dimensions. @type tile_manager: Tiling @param nn: The output of the neural network node from which detections are received. @type nn: dai.Node.Output @param conf_thresh: Confidence threshold for filtering detections (default: 0.3). @type conf_thresh: float @param iou_thresh: IOU threshold for non-max suppression (default: 0.4). @type iou_thresh: float @return: Returns self for method chaining. @rtype: TilesPatcher
method
process(self, nn_output: dai.ImgDetections)
Processes each neural network output (detections) by mapping them from tiled patches back into the global frame and buffering them until all tiles for the current frame are processed. @param nn_output: The detections from the neural network's output. @type nn_output: dai.ImgDetections
class
depthai_nodes.node.Tiling(depthai.node.HostNode)
method
__init__(self)
Initializes the Tiling node, setting default attributes like overlap, grid size, and tile positions.
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
build(self, overlap: float, img_output: dai.Node.Output, grid_size: Tuple, img_shape: Tuple, nn_shape: Tuple, global_detection: bool = False, grid_matrix=None) -> Tiling
Configures the Tiling node with grid size, overlap, image and neural network shapes, and other necessary parameters. @param overlap: Overlap between adjacent tiles, valid in [0,1). @type overlap: float @param img_output: The node from which the frames are sent. @type img_output: dai.Node.Output @param grid_size: Number of tiles horizontally and vertically. @type grid_size: tuple @param img_shape: Shape of the original image. @type img_shape: tuple @param nn_shape: Shape of the neural network input. @type nn_shape: tuple @param global_detection: Whether to perform global detection. Defaults to False. @type global_detection: bool @param grid_matrix: Predefined matrix for tiling. Defaults to None. @type grid_matrix: list or None @return: Returns self for method chaining. @rtype: Tiling
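A hedged end-to-end sketch of tiled detection; the camera output request, shapes, grid size, and model archive path are placeholders, and the `out` outputs of Tiling and TilesPatcher are assumed to follow the usual host-node convention.
Python
import depthai as dai
from depthai_nodes.node import Tiling, TilesPatcher

with dai.Pipeline() as pipeline:
    camera = pipeline.create(dai.node.Camera).build()
    img_out = camera.requestOutput((1920, 1080), dai.ImgFrame.Type.BGR888p)  # assumed request API
    tiling = pipeline.create(Tiling).build(
        overlap=0.1,
        img_output=img_out,
        grid_size=(2, 2),          # 2x2 tiles
        img_shape=(1920, 1080),    # original image size
        nn_shape=(512, 288),       # detection model input size
    )
    nn = pipeline.create(dai.node.DetectionNetwork).build(
        tiling.out, dai.NNArchive("detector.tar.xz")  # placeholder archive; `tiling.out` is assumed
    )
    patcher = pipeline.create(TilesPatcher).build(
        tile_manager=tiling, nn=nn.out, conf_thresh=0.3, iou_thresh=0.4
    )
    detections_queue = patcher.out.createOutputQueue()  # `out` output assumed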
method
process(self, img_frame)
Processes the input frame by cropping and tiling it, and sending each tile to the neural network input. @param img_frame: The frame to be sent to a neural network. @type img_frame: dai.ImgFrame