NNComponent¶
NNComponent abstracts sourcing and decoding AI models, creating the DepthAI API nodes for neural inferencing and object tracking, and setting up MultiStage pipelines. It also supports Roboflow integration.
DepthAI API nodes¶
For neural inference, NNComponent will create one of the following DepthAI API nodes:

- If the AI model is MobileNet-SSD based, this component will create a MobileNetDetectionNetwork (or MobileNetSpatialDetectionNetwork if the spatial argument is set).
- If the AI model is YOLO based, this component will create a YoloDetectionNetwork (or YoloSpatialDetectionNetwork if the spatial argument is set).
- Otherwise, the component will create a NeuralNetwork node.

If the tracker argument is set and the model is YOLO/MobileNet-SSD based, this component will also create an ObjectTracker node and connect the two nodes together. A sketch of the spatial variant follows below.
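As a minimal illustration of the mapping above (assuming the 'mobilenet-ssd' model from the SDK model zoo), passing a StereoComponent as the spatial argument makes NNComponent create a MobileNetSpatialDetectionNetwork:

from depthai_sdk import OakCamera

with OakCamera() as oak:
    color = oak.create_camera('color')
    stereo = oak.create_stereo()  # depth is required for spatial detections
    # spatial=stereo -> MobileNetSpatialDetectionNetwork gets created
    nn = oak.create_nn('mobilenet-ssd', color, spatial=stereo)
    oak.visualize(nn.out.main)
    oak.start(blocking=True)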
Usage¶
from depthai_sdk import OakCamera, ResizeMode

with OakCamera(recording='cars-tracking-above-01') as oak:
    color = oak.create_camera('color')
    nn = oak.create_nn('vehicle-detection-0202', color, tracker=True)
    nn.config_nn(resize_mode='stretch')

    oak.visualize([nn.out.tracker, nn.out.passthrough], fps=True)
    oak.start(blocking=True)
Component outputs¶

- main - Default output. Streams NN results and high-res frames that were downscaled and used for inferencing. Produces DetectionPacket or TwoStagePacket (if it's a 2nd-stage NNComponent).
- passthrough - Streams NN results and passthrough frames (the exact frames used for inferencing). Produces DetectionPacket or TwoStagePacket (if it's a 2nd-stage NNComponent).
- spatials - Streams depth and bounding box mappings (SpatialDetectionNetwork.boundingBoxMapping). Produces SpatialBbMappingPacket.
- twostage_crops - Streams 2nd-stage cropped frames to the host. Produces FramePacket.
- tracker - Streams ObjectTracker's tracklets and high-res frames that were downscaled and used for inferencing. Produces TrackerPacket.
- nn_data - Streams raw NN output. Produces NNDataPacket.
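Any of these outputs can also be consumed with a plain callback instead of the visualizer. A minimal sketch for the default main output, assuming the 'mobilenet-ssd' model from the SDK model zoo:

from depthai_sdk import OakCamera
from depthai_sdk.classes import DetectionPacket

def cb(packet: DetectionPacket):
    # img_detections holds the decoded detections for this frame
    print(packet.img_detections.detections)

with OakCamera() as oak:
    color = oak.create_camera('color')
    nn = oak.create_nn('mobilenet-ssd', color)
    oak.callback(nn.out.main, callback=cb)
    oak.start(blocking=True)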
Decoding outputs¶
NNComponent allows users to define their own decoding functions, which map raw NN output (dai.NNData) to one of the standardized output classes listed in the Reference section below (e.g. Detections, SemanticSegmentation).
Note
This feature is still in development and is not guaranteed to work correctly in all cases.
Example usage:
import numpy as np
from depthai import NNData
from depthai_sdk import OakCamera
from depthai_sdk.classes import Detections, DetectionPacket
from depthai_sdk.visualize import Visualizer

def decode(nn_data: NNData):
    # Raw SSD-style output: one row per detection, 7 values each
    layer = nn_data.getFirstLayerFp16()
    results = np.array(layer).reshape((1, 1, -1, 7))
    dets = Detections(nn_data)

    for result in results[0][0]:
        if result[2] > 0.5:  # confidence threshold
            dets.add(result[1], result[2], result[3:])  # label, confidence, bbox

    return dets

def callback(packet: DetectionPacket, visualizer: Visualizer):
    detections: Detections = packet.img_detections
    ...

with OakCamera() as oak:
    color = oak.create_camera('color')
    nn = oak.create_nn(..., color, decode_fn=decode)
    oak.visualize(nn, callback=callback)
    oak.start(blocking=True)
Reference¶

class depthai_sdk.components.NNComponent(device, pipeline, model, input, nn_type=None, decode_fn=None, tracker=False, spatial=None, replay=None, args=None)¶

- get_name()¶
- get_labels()¶
- config_multistage_nn(debug=False, labels=None, scale_bb=None, num_frame_pool=None)¶
  Configures the MultiStage NN pipeline. Available if the input to this NNComponent is a Detection NNComponent. See the two-stage sketch below.
  Parameters:
  - debug (bool, default False) – Debug script node
  - labels (List[int], optional) – Crop & run inference only on objects with these labels
  - scale_bb (Tuple[int, int], optional) – Scale detection bounding boxes (x, y) before cropping the frame, in %
  - num_frame_pool (int, optional) – Number of frames to pool for inference. If None, the default value is used.
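A minimal two-stage sketch, assuming the face-detection-retail-0004 and age-gender-recognition-retail-0013 models from the SDK model zoo:

from depthai_sdk import OakCamera

with OakCamera() as oak:
    color = oak.create_camera('color')
    det = oak.create_nn('face-detection-retail-0004', color)
    # The second-stage NN runs on crops of the first stage's detections
    rec = oak.create_nn('age-gender-recognition-retail-0013', input=det)
    rec.config_multistage_nn(scale_bb=(10, 10))  # enlarge each bbox by 10% before cropping
    oak.visualize(rec)
    oak.start(blocking=True)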
- config_tracker(tracker_type=None, track_labels=None, assignment_policy=None, max_obj=None, threshold=None, apply_tracking_filter=None, forget_after_n_frames=None, calculate_speed=None)¶
  Configures the ObjectTracker node (if it's enabled). See the sketch below.
  Parameters:
  - tracker_type (dai.TrackerType, optional) – Set object tracker type
  - track_labels (List[int], optional) – Set detection labels to track
  - assignment_policy (dai.TrackerIdAssignmentPolicy, optional) – Set object tracker ID assignment policy
  - max_obj (int, optional) – Set max objects to track. Max 60.
  - threshold (float, optional) – Specify tracker threshold. Default: 0.0
  - apply_tracking_filter (bool, optional) – Set whether to apply a Kalman filter to the tracked objects. Done on the host.
  - forget_after_n_frames (int, optional) – Set how many frames to track an object before forgetting it
  - calculate_speed (bool, optional) – Set whether to calculate object speed. Done on the host.
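A sketch of tuning the tracker, where nn is an NNComponent created with tracker=True (the enum values come from the depthai library):

import depthai as dai

nn.config_tracker(
    tracker_type=dai.TrackerType.ZERO_TERM_COLOR_HISTOGRAM,
    track_labels=[0],                  # track only the first label
    assignment_policy=dai.TrackerIdAssignmentPolicy.UNIQUE_ID,
    max_obj=10,
    apply_tracking_filter=True,        # host-side Kalman filter
    calculate_speed=True,              # host-side speed estimation
)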
- config_yolo_from_metadata(metadata)¶
  Configures the (Spatial) Yolo Detection Network node from a dictionary. Calls config_yolo().
  Parameters:
  - metadata (Dict)
- config_yolo(num_classes, coordinate_size, anchors, masks, iou_threshold, conf_threshold=None)¶
  Configures the (Spatial) Yolo Detection Network node.
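A sketch of the metadata dictionary shape for config_yolo_from_metadata(). The key names below mirror the 'metadata' section of YOLO model JSON configs and are an assumption here; the anchor values shown are the classic YOLOv3-tiny ones:

nn.config_yolo_from_metadata({
    "classes": 80,
    "coordinates": 4,
    "anchors": [10, 14, 23, 27, 37, 58, 81, 82, 135, 169, 344, 319],
    "anchor_masks": {"side26": [0, 1, 2], "side13": [3, 4, 5]},
    "iou_threshold": 0.5,
    "confidence_threshold": 0.5,
})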
- config_nn(conf_threshold=None, resize_mode=None)¶
  Configures the Detection Network node.
- config_spatial(bb_scale_factor=None, lower_threshold=None, upper_threshold=None, calc_algo=None)¶
  Configures the Spatial Detection Network node. See the sketch below.
  Parameters:
  - bb_scale_factor (float, optional) – Specifies the scale factor for detected bounding boxes (0..1]
  - lower_threshold (int, optional) – Specifies the lower threshold in depth units (millimeters by default) for depth values that will be used to calculate spatial data
  - upper_threshold (int, optional) – Specifies the upper threshold in depth units (millimeters by default) for depth values that will be used to calculate spatial data
  - calc_algo (dai.SpatialLocationCalculatorAlgorithm, optional) – Specifies the spatial location calculator algorithm: Average/Min/Max
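A sketch of constraining the depth range used for spatial coordinates, where nn is an NNComponent created with the spatial argument:

import depthai as dai

nn.config_spatial(
    bb_scale_factor=0.5,    # average depth only over the central part of each bbox
    lower_threshold=300,    # ignore depth closer than 300 mm
    upper_threshold=10000,  # ignore depth farther than 10 m
    calc_algo=dai.SpatialLocationCalculatorAlgorithm.AVERAGE,
)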
- get_bbox()¶
  Return type: depthai_sdk.visualize.bbox.BoundingBox
- class Out(nn_component)¶
  - class MainOut(component)¶ – Default output. Streams NN results and high-res frames that were downscaled and used for inferencing. Produces DetectionPacket or TwoStagePacket (if it's a 2nd-stage NNComponent).
  - class PassThroughOut(component)¶
  - class ImgManipOut(component)¶
  - class InputOut(component)¶
  - class SpatialOut(component)¶
  - class TwoStageOut(component)¶
  - class TrackerOut(component)¶
  - class EncodedOut(component)¶
  - class NnDataOut(component)¶
- is_multi_stage()¶
General (standardized) NN outputs, to be used for higher-level abstractions (e.g. automatic visualization of results). "SDK supported NN models" will have to have a standard NN output: either dai.ImgDetections, or one of the outputs below. If the latter, the model's JSON config will include handler.py logic for decoding to the standard NN output. These will be integrated into depthai-core; bonus points for on-device decoding of some popular models.
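For instance, a decode function for a segmentation model could return the SemanticSegmentation class documented below. This is only a sketch; the layer type, output shape, and class count are assumptions that depend on the concrete model:

import numpy as np
from depthai import NNData
from depthai_sdk.classes.nn_results import SemanticSegmentation

def decode_segmentation(nn_data: NNData):
    # Assumed: a single output layer of per-pixel class indices, 256x256, 3 classes
    class_map = np.array(nn_data.getFirstLayerInt32()).reshape(256, 256)
    masks = [(class_map == i).astype(np.uint8) for i in range(3)]  # one binary mask per class
    return SemanticSegmentation(nn_data, mask=masks)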
class depthai_sdk.classes.nn_results.Detection(img_detection: Optional[Union[depthai.ImgDetection, depthai.SpatialImgDetection]], label_str: str, confidence: float, color: Tuple[int, int, int], bbox: depthai_sdk.visualize.bbox.BoundingBox, angle: Optional[int], ts: Optional[datetime.timedelta])¶

- img_detection: Optional[Union[depthai.ImgDetection, depthai.SpatialImgDetection]]¶
- bbox: depthai_sdk.visualize.bbox.BoundingBox¶
- ts: Optional[datetime.timedelta]¶
- property top_left¶
- property bottom_right¶
class depthai_sdk.classes.nn_results.TrackingDetection(img_detection: Optional[Union[depthai.ImgDetection, depthai.SpatialImgDetection]], label_str: str, confidence: float, color: Tuple[int, int, int], bbox: depthai_sdk.visualize.bbox.BoundingBox, angle: Optional[int], ts: Optional[datetime.timedelta], tracklet: depthai.Tracklet, filtered_2d: depthai_sdk.visualize.bbox.BoundingBox, filtered_3d: depthai.Point3f, speed: Optional[float])¶

- tracklet: depthai.Tracklet¶
- filtered_2d: depthai_sdk.visualize.bbox.BoundingBox¶
- filtered_3d: depthai.Point3f¶
- property speed_kmph¶
- property speed_mph¶
class depthai_sdk.classes.nn_results.TwoStageDetection(img_detection: Optional[Union[depthai.ImgDetection, depthai.SpatialImgDetection]], label_str: str, confidence: float, color: Tuple[int, int, int], bbox: depthai_sdk.visualize.bbox.BoundingBox, angle: Optional[int], ts: Optional[datetime.timedelta], nn_data: depthai.NNData)¶

- nn_data: depthai.NNData¶
class depthai_sdk.classes.nn_results.GenericNNOutput(nn_data)¶
Generic NN output, to be used for higher-level abstractions (e.g. automatic visualization of results).

- getTimestamp()¶
class depthai_sdk.classes.nn_results.Detections(nn_data, is_rotated=False)¶
Detection results containing bounding boxes, labels and confidences. Optionally can contain rotation angles.
class depthai_sdk.classes.nn_results.SemanticSegmentation(nn_data, mask)¶
Semantic segmentation results, with a mask for each class.
Examples: DeeplabV3, Lanenet, road-segmentation-adas-0001.

- mask: List[numpy.ndarray]¶
class depthai_sdk.classes.nn_results.ImgLandmarks(nn_data, landmarks=None, landmarks_indices=None, pairs=None, colors=None)¶
Landmarks results, with a list of landmarks and pairs of landmarks to draw lines between.
Examples: human-pose-estimation-0001, openpose2, facial-landmarks-68, landmarks-regression-retail-0009.
class depthai_sdk.classes.nn_results.InstanceSegmentation(nn_data, masks, labels)¶
Instance segmentation results, with a mask for each instance.

- masks: List[numpy.ndarray]¶