API reference
0.3.5
package
luxonis_train
module
registry
This module implements a metaclass for automatic registration of classes.
package
luxonis_train.assigners
module
luxonis_train.assigners.utils
function
candidates_in_gt(anchor_centers: Tensor, gt_bboxes: Tensor, eps: float = 1e-09) -> Tensor: Tensor
Check if anchor box's center is in any GT bbox. @type anchor_centers: Tensor @param anchor_centers: Centers of anchor bboxes [n_anchors, 2] @type gt_bboxes: Tensor @param gt_bboxes: Ground truth bboxes [bs * n_max_boxes, 4] @type eps: float @param eps: Threshold for minimum delta. Defaults to 1e-9. @rtype: Tensor @return: Mask for anchors inside any GT bbox
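A minimal usage sketch (the import path follows the module listing above; shapes are taken from the docstring):

    import torch
    from luxonis_train.assigners.utils import candidates_in_gt

    anchor_centers = torch.tensor([[5.0, 5.0], [50.0, 50.0]])  # [n_anchors, 2]
    gt_bboxes = torch.tensor([[0.0, 0.0, 10.0, 10.0]])  # [bs * n_max_boxes, 4]

    # Flags, per GT box, which anchor centers fall inside it:
    # center (5, 5) lies inside the box, center (50, 50) does not.
    mask = candidates_in_gt(anchor_centers, gt_bboxes)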
function
fix_collisions(mask_pos: Tensor, overlaps: Tensor, n_max_boxes: int) -> tuple[Tensor, Tensor, Tensor]: tuple[Tensor, Tensor, Tensor]
If an anchor is assigned to multiple GTs, the one with the highest IoU is selected. @type mask_pos: Tensor @param mask_pos: Mask of assigned anchors [bs, n_max_boxes, n_anchors] @type overlaps: Tensor @param overlaps: IoUs between GTs and anchors [bs, n_max_boxes, n_anchors] @type n_max_boxes: int @param n_max_boxes: Number of maximum boxes per image @rtype: tuple[Tensor, Tensor, Tensor] @return: Assigned indices, sum of positive mask, positive mask
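The resolution rule in isolation, as a sketch: when an anchor is positive for several GTs, an argmax over the GT dimension of the IoU tensor picks the single GT to keep.

    import torch

    # [bs=1, n_max_boxes=2, n_anchors=2]: anchor 0 is assigned to both GTs.
    mask_pos = torch.tensor([[[1.0, 0.0], [1.0, 1.0]]])
    overlaps = torch.tensor([[[0.6, 0.1], [0.8, 0.7]]])
    # For each anchor, the GT with the highest IoU wins the collision.
    best_gt = overlaps.argmax(dim=1)  # -> [[1, 1]]: anchor 0 keeps GT 1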
function
batch_iou(batch1: Tensor, batch2: Tensor) -> Tensor: Tensor
Calculates IoU for each pair of bounding boxes in the batch. Bounding boxes must be in the "xyxy" format. @type batch1: Tensor @param batch1: Tensor of shape C{[bs, N, 4]} @type batch2: Tensor @param batch2: Tensor of shape C{[bs, M, 4]} @rtype: Tensor @return: Per image box IoU of shape C{[bs, N, M]}
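For instance (a sketch; the import path follows the module listing above):

    import torch
    from luxonis_train.assigners.utils import batch_iou

    batch1 = torch.tensor([[[0.0, 0.0, 10.0, 10.0]]])  # [bs=1, N=1, 4], "xyxy"
    batch2 = torch.tensor([[[5.0, 5.0, 15.0, 15.0],
                            [20.0, 20.0, 30.0, 30.0]]])  # [bs=1, M=2, 4]

    iou = batch_iou(batch1, batch2)  # shape [1, 1, 2]
    # First pair: 25 / (100 + 100 - 25) = 1/7 ≈ 0.143; second pair: 0.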
class
luxonis_train.assigners.ATSSAssigner(torch.nn.Module)
method
__init__(self, n_classes: int, topk: int = 9)
Adaptive Training Sample Selection Assigner, adapted from U{Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection<https://arxiv.org/pdf/1912.02424.pdf>}. Code is adapted from: U{https://github.com/Nioolek/PPYOLOE_pytorch/blob/master/ppyoloe/assigner/atss_assigner.py} and U{https://github.com/fcjian/TOOD/blob/master/mmdet/core/bbox/assigners/atss_assigner.py} @type n_classes: int @param n_classes: Number of classes in the dataset. @type topk: int @param topk: Number of anchors considered in selection. Defaults to 9.
variable
variable
method
forward(self, anchor_bboxes: Tensor, n_level_bboxes: list[int], gt_labels: Tensor, gt_bboxes: Tensor, mask_gt: Tensor, pred_bboxes: Tensor) -> tuple[Tensor, Tensor, Tensor, Tensor, Tensor]: tuple[Tensor, Tensor, Tensor, Tensor, Tensor]
Assigner's forward method which generates final assignments. @type anchor_bboxes: Tensor @param anchor_bboxes: Anchor bboxes of shape [n_anchors, 4] @type n_level_bboxes: list[int] @param n_level_bboxes: Number of bboxes per level @type gt_labels: Tensor @param gt_labels: Initial GT labels [bs, n_max_boxes, 1] @type gt_bboxes: Tensor @param gt_bboxes: Initial GT bboxes [bs, n_max_boxes, 4] @type mask_gt: Tensor @param mask_gt: Mask for valid GTs [bs, n_max_boxes, 1] @type pred_bboxes: Tensor @param pred_bboxes: Predicted bboxes of shape [bs, n_anchors, 4] @rtype: tuple[Tensor, Tensor, Tensor, Tensor, Tensor] @return: Assigned labels of shape [bs, n_anchors], assigned bboxes of shape [bs, n_anchors, 4], assigned scores of shape [bs, n_anchors, n_classes] and output positive mask of shape [bs, n_anchors].
variable
variable
variable
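The selection rule at the heart of ATSS, sketched in isolation (this mirrors the paper linked above, not the exact code): for each GT, the top-k anchors per pyramid level are taken by center distance, and the IoU threshold is the mean plus the standard deviation of the candidates' IoUs.

    import torch

    # IoUs of the top-k candidate anchors for one GT (illustrative values).
    candidate_ious = torch.tensor([0.6, 0.5, 0.2, 0.1])
    threshold = candidate_ious.mean() + candidate_ious.std()
    positives = candidate_ious >= threshold  # only sufficiently high IoUs survive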
class
luxonis_train.assigners.TaskAlignedAssigner(torch.nn.Module)
method
__init__(self, n_classes: int, topk: int = 13, alpha: float = 1.0, beta: float = 6.0, eps: float = 1e-09)
Task Aligned Assigner. Adapted from: U{TOOD: Task-aligned One-stage Object Detection<https://arxiv.org/pdf/2108.07755.pdf>}. Code is adapted from: U{https://github.com/Nioolek/PPYOLOE_pytorch/blob/master/ppyoloe/assigner/tal_assigner.py}. @license: U{Apache License, Version 2.0<https://github.com/Nioolek/PPYOLOE_pytorch/tree/master?tab=Apache-2.0-1-ov-file#readme>} @type n_classes: int @param n_classes: Number of classes in the dataset. @type topk: int @param topk: Number of anchors considered in selection. Defaults to 13. @type alpha: float @param alpha: Defaults to 1.0. @type beta: float @param beta: Defaults to 6.0. @type eps: float @param eps: Defaults to 1e-9.
variable
variable
variable
variable
variable
method
forward(self, pred_scores: Tensor, pred_bboxes: Tensor, anchor_points: Tensor, gt_labels: Tensor, gt_bboxes: Tensor, mask_gt: Tensor, pred_kpts: Tensor | None = None, gt_kpts: Tensor | None = None, sigmas: Tensor | None = None, area_factor: float | None = None) -> tuple[Tensor, Tensor, Tensor, Tensor, Tensor]: tuple[Tensor, Tensor, Tensor, Tensor, Tensor]
Assigner's forward method which generates final assignments. If both pred_kpts and gt_kpts are provided, a pose OKS is computed and used in the alignment metric; the final tuple then includes assigned poses. @type pred_scores: Tensor @param pred_scores: Predicted scores [bs, n_anchors, 1] @type pred_bboxes: Tensor @param pred_bboxes: Predicted bboxes [bs, n_anchors, 4] @type anchor_points: Tensor @param anchor_points: Anchor points [n_anchors, 2] @type gt_labels: Tensor @param gt_labels: Initial GT labels [bs, n_max_boxes, 1] @type gt_bboxes: Tensor @param gt_bboxes: Initial GT bboxes [bs, n_max_boxes, 4] @type mask_gt: Tensor @param mask_gt: Mask for valid GTs [bs, n_max_boxes, 1] @type pred_kpts: Tensor | None @param pred_kpts: Predicted keypoints [bs, n_anchors, n_kpts, 3] (optional) @type gt_kpts: Tensor | None @param gt_kpts: Ground truth keypoints [bs, n_max_boxes, n_kpts, 3] (optional) @type sigmas: Tensor | None @param sigmas: Sigmas for OKS computation if keypoints are used. @type area_factor: float | None @param area_factor: Area factor for OKS computation. Defaults to 0.53. @rtype: tuple[Tensor, Tensor, Tensor, Tensor, Tensor] @return: Assigned labels of shape [bs, n_anchors], assigned bboxes of shape [bs, n_anchors, 4], assigned scores of shape [bs, n_anchors, n_classes] and output mask of shape [bs, n_anchors]
variable
variable
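The task-alignment metric from the TOOD paper combines the classification score s and the IoU u as t = s^alpha * u^beta; the top-k anchors by t become positives for each GT. A sketch with the default alpha and beta:

    import torch

    alpha, beta = 1.0, 6.0  # defaults documented above
    scores = torch.tensor([0.9, 0.5])  # predicted class scores s
    ious = torch.tensor([0.8, 0.9])    # IoUs with the GT box u
    alignment = scores.pow(alpha) * ious.pow(beta)
    # Anchors are ranked by this metric; beta=6 strongly favors good localization.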
package
luxonis_train.attached_modules
module
luxonis_train.attached_modules.base_attached_module
class
BaseAttachedModule
Base class for all modules that are attached to a LuxonisNode. Attached modules include losses, metrics and visualizers. This class contains a default implementation of the prepare method, which should be sufficient for most simple cases; more complex modules should override it. When subclassing, the following method can be overridden: prepare: Prepares node outputs for the forward pass of the module. Override this method if the default implementation is not sufficient. Additionally, the following attribute can be overridden: supported_tasks: List of task types that the module supports. Used to determine which labels to extract from the dataset and to validate compatibility with the node based on the node's tasks.
class
luxonis_train.attached_modules.base_attached_module.BaseAttachedModule(torch.nn.Module, abc.ABC)
variable
supported_tasks
List of task types that the module supports. Elements of the list can be either a single task type or a tuple of task types. In case of the latter, the module requires all of the specified labels in the tuple to be present.
method
__init__(self, node: BaseNode | None = None, kwargs)
Constructor for the C{BaseAttachedModule}. @type node: BaseNode @param node: Reference to the node that this module is attached to. @param kwargs: Additional keyword arguments.
property
property
property
property
property
node
Reference to the node that this module is attached to.
property
n_keypoints
Getter for the number of keypoints.
property
n_classes
Getter for the number of classes.
property
original_in_shape
Getter for the original input shape as [N, H, W].
property
classes
Getter for the class mapping.
method
package
luxonis_train.attached_modules.losses
module
module
module
module
module
module
module
module
module
module
module
module
module
module
module
module
module
class
class
BaseLoss
A base class for all loss functions. This class defines the basic interface for all loss functions. It utilizes automatic registration of defined subclasses to a LOSSES registry.
class
class
CrossEntropyLoss
This criterion computes the cross entropy loss between input logits and target.
class
CTCLoss
CTC loss with optional focal loss weighting.
class
class
class
class
OHEMBCEWithLogitsLoss
This criterion computes the binary cross entropy loss between input logits and target with OHEM (Online Hard Example Mining).
class
OHEMCrossEntropyLoss
This criterion computes the cross entropy loss between input logits and target with OHEM (Online Hard Example Mining).
class
OHEMLoss
Generic OHEM loss that can be used with different criterions.
class
class
class
class
class
class
module
luxonis_train.attached_modules.losses.adaptive_detection_loss
class
class
luxonis_train.attached_modules.losses.adaptive_detection_loss.VarifocalLoss(torch.nn.Module)
method
__init__(self, alpha: float = 0.75, gamma: float = 2.0, per_class_weights: Tensor | None = None)
Varifocal Loss is a loss function for training a dense object detector to predict the IoU-aware classification score, inspired by focal loss. Code is adapted from: U{https://github.com/Nioolek/PPYOLOE_pytorch/blob/master/ppyoloe/models/losses.py} @type alpha: float @param alpha: alpha parameter in focal loss, default is 0.75. @type gamma: float @param gamma: gamma parameter in focal loss, default is 2.0. @type per_class_weights: Tensor | None @param per_class_weights: A list of weights to scale the loss for each class during training. This allows you to emphasize or de-emphasize certain classes based on their importance or representation in the dataset. The weights' length must be equal to the number of classes.
variable
variable
variable
method
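A sketch of the varifocal weighting from the VarifocalNet formulation this class adapts (illustrative, not the exact implementation): negatives are down-weighted by alpha * p^gamma, while positives are weighted by the IoU-aware target score q.

    import torch
    import torch.nn.functional as F

    alpha, gamma = 0.75, 2.0  # defaults documented above
    logits = torch.randn(4)
    q = torch.tensor([0.9, 0.0, 0.7, 0.0])  # IoU-aware target scores
    p = logits.sigmoid()
    weight = alpha * p.pow(gamma) * (q == 0) + q * (q > 0)
    bce = F.binary_cross_entropy_with_logits(logits, q, reduction="none")
    loss = (bce * weight).mean()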
module
luxonis_train.attached_modules.losses.embedding_losses
constant
class
luxonis_train.attached_modules.losses.precision_dfl_detection_loss.BBoxLoss(torch.nn.Module)
method
__init__(self, reg_max: int = 16)
BBox loss that combines IoU and DFL losses. @type reg_max: int @param reg_max: Maximum number of regression channels. Defaults to 16.
variable
method
class
luxonis_train.attached_modules.losses.precision_dfl_detection_loss.DFLoss(torch.nn.Module)
method
__init__(self, reg_max: int = 16)
DFL loss that combines classification and regression losses. @type reg_max: int @param reg_max: Maximum number of regression channels. Defaults to 16.
variable
method
module
luxonis_train.attached_modules.losses.reconstruction_segmentation_loss
class
function
function
function
class
luxonis_train.attached_modules.losses.reconstruction_segmentation_loss.SSIM(torch.nn.Module)
method
variable
variable
variable
variable
variable
method
class
luxonis_train.attached_modules.losses.AdaptiveDetectionLoss(luxonis_train.attached_modules.losses.BaseLoss)
variable
variable
variable
variable
variable
variable
variable
method
__init__(self, n_warmup_epochs: int = 0, iou_type: IoUType = 'giou', reduction: Literal['sum', 'mean'] = 'mean', class_loss_weight: float = 1.0, iou_loss_weight: float = 2.5, per_class_weights: list[float] | None = None, kwargs)
BBox loss adapted from U{YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications <https://arxiv.org/pdf/2209.02976.pdf>}. It combines IoU based bbox regression loss and varifocal loss for classification. Code is adapted from U{https://github.com/Nioolek/PPYOLOE_pytorch/blob/master/ppyoloe/models}. @type n_warmup_epochs: int @param n_warmup_epochs: Number of epochs where ATSS assigner is used, after that we switch to TAL assigner. @type iou_type: L{IoUType} @param iou_type: IoU type used for bbox regression loss. @type reduction: Literal["sum", "mean"] @param reduction: Reduction type for loss. @type class_loss_weight: float @param class_loss_weight: Weight of classification loss. Defaults to 1.0. For optimal results, multiply with accumulate_grad_batches. @type iou_loss_weight: float @param iou_loss_weight: Weight of IoU loss. Defaults to 2.5. For optimal results, multiply with accumulate_grad_batches. @type per_class_weights: list[float] | None @param per_class_weights: A list of weights to scale the loss for each class during training. This allows you to emphasize or de-emphasize certain classes based on their importance or representation in the dataset. The weights' length must be equal to the number of classes.
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
variable
class
luxonis_train.attached_modules.losses.BaseLoss(luxonis_train.attached_modules.BaseAttachedModule)
method
forward(self, args: Tensor | list[Tensor]) -> Tensor|tuple[Tensor, dict[str, Tensor]]: Tensor|tuple[Tensor, dict[str, Tensor]]
Forward pass of the loss function. @type args: Unpack[Ts] @param args: Prepared inputs from the L{prepare} method. @rtype: Tensor | tuple[Tensor, dict[str, Tensor]] @return: The main loss and optionally a dictionary of sub-losses (for logging). Only the main loss is used for backpropagation.
method
run(self, inputs: Packet[Tensor], labels: Labels) -> Tensor|tuple[Tensor, dict[str, Tensor]]: Tensor|tuple[Tensor, dict[str, Tensor]]
Calls the loss function. Validates and prepares the inputs, then calls the loss function. @type inputs: Packet[Tensor] @param inputs: Outputs from the node. @type labels: L{Labels} @param labels: Labels from the dataset. @rtype: Tensor | tuple[Tensor, dict[str, Tensor]] @return: The main loss and optionally a dictionary of sub-losses (for logging). Only the main loss is used for backpropagation. @raises IncompatibleError: If the inputs are not compatible with the module.
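A hypothetical minimal subclass, sketched to show the interface (registration to the LOSSES registry happens automatically, per the BaseLoss description above):

    from torch import Tensor
    from luxonis_train.attached_modules.losses import BaseLoss

    class MyL1Loss(BaseLoss):
        def forward(self, predictions: Tensor, target: Tensor) -> Tensor:
            # Returns a single scalar; a (loss, sub_losses) tuple is also allowed.
            return (predictions - target).abs().mean()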
class
luxonis_train.attached_modules.losses.BCEWithLogitsLoss(luxonis_train.attached_modules.losses.BaseLoss)
variable
method
__init__(self, weight: list[float] | None = None, reduction: Literal['none', 'mean', 'sum'] = 'mean', pos_weight: Tensor | None = None, kwargs)
This loss combines a L{nn.Sigmoid} layer and the L{nn.BCELoss} in one single class. This version is more numerically stable than using a plain C{Sigmoid} followed by a C{BCELoss} as, by combining the operations into one layer, we take advantage of the log-sum-exp trick for numerical stability. @type weight: list[float] | None @param weight: a manual rescaling weight given to the loss of each batch element. If given, has to be a list of length C{nbatch}. Defaults to C{None}. @type reduction: Literal["none", "mean", "sum"] @param reduction: Specifies the reduction to apply to the output: C{"none"} | C{"mean"} | C{"sum"}. C{"none"}: no reduction will be applied, C{"mean"}: the sum of the output will be divided by the number of elements in the output, C{"sum"}: the output will be summed. Note: C{size_average} and C{reduce} are in the process of being deprecated, and in the meantime, specifying either of those two args will override C{reduction}. Defaults to C{"mean"}. @type pos_weight: Tensor | None @param pos_weight: a weight of positive examples to be broadcasted with target. Must be a tensor with equal size along the class dimension to the number of classes. Pay close attention to PyTorch's broadcasting semantics in order to achieve the desired operations. For a target of size [B, C, H, W] (where B is batch size) pos_weight of size [B, C, H, W] will apply different pos_weights to each element of the batch or [C, H, W] the same pos_weights across the batch. To apply the same positive weight along all spatial dimensions for a 2D multi-class target [C, H, W] use: [C, 1, 1]. Defaults to C{None}.
variable
method
forward(self, predictions: Tensor, target: Tensor) -> Tensor: Tensor
Computes the BCE loss from logits. @type predictions: Tensor @param predictions: Network predictions of shape (N, C, ...) @type target: Tensor @param target: A tensor of the same shape as predictions. @rtype: Tensor @return: A scalar tensor.
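A broadcasting sketch mirroring the pos_weight note above, written with the plain torch functional for clarity: for a [B, C, H, W] target, a pos_weight of shape [C, 1, 1] applies one positive weight per class across the batch and all spatial positions.

    import torch
    import torch.nn.functional as F

    predictions = torch.randn(2, 3, 4, 4)  # [B, C, H, W] logits
    target = torch.randint(0, 2, (2, 3, 4, 4)).float()
    pos_weight = torch.tensor([1.0, 2.0, 0.5]).view(3, 1, 1)  # one weight per class
    loss = F.binary_cross_entropy_with_logits(predictions, target, pos_weight=pos_weight)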
class
luxonis_train.attached_modules.losses.CrossEntropyLoss(luxonis_train.attached_modules.losses.BaseLoss)
class
luxonis_train.attached_modules.losses.CTCLoss(luxonis_train.attached_modules.losses.BaseLoss)
variable
method
__init__(self, use_focal_loss: bool = True, kwargs)
Initializes the CTC loss with optional focal loss support. @type use_focal_loss: bool @param use_focal_loss: Whether to apply focal loss weighting to the CTC loss. Defaults to True.
variable
variable
method
forward(self, predictions: Tensor, target: Tensor) -> Tensor: Tensor
Computes the CTC loss, optionally applying focal loss. @type predictions: Tensor @param predictions: Network predictions of shape (B, T, C), where T is the sequence length, B is the batch size, and C is the number of classes. @type target: Tensor @param target: Encoded target sequences. @rtype: Tensor @return: The computed loss as a scalar tensor.
class
luxonis_train.attached_modules.losses.EfficientKeypointBBoxLoss(luxonis_train.attached_modules.losses.AdaptiveDetectionLoss)
variable
variable
variable
method
__init__(self, n_warmup_epochs: int = 0, iou_type: IoUType = 'giou', reduction: Literal['sum', 'mean'] = 'mean', class_loss_weight: float = 0.5, iou_loss_weight: float = 7.5, viz_pw: float = 1.0, regr_kpts_loss_weight: float = 12, vis_kpts_loss_weight: float = 1.0, sigmas: list[float] | None = None, area_factor: float | None = None, kwargs)
BBox loss adapted from U{YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications <https://arxiv.org/pdf/2209.02976.pdf>}. It combines IoU based bbox regression loss and varifocal loss for classification. Code is adapted from U{https://github.com/Nioolek/PPYOLOE_pytorch/blob/master/ppyoloe/models}. @type n_warmup_epochs: int @param n_warmup_epochs: Number of epochs where ATSS assigner is used, after that we switch to TAL assigner. @type iou_type: Literal["none", "giou", "diou", "ciou", "siou"] @param iou_type: IoU type used for bbox regression loss. @type reduction: Literal["sum", "mean"] @param reduction: Reduction type for loss. @type class_loss_weight: float @param class_loss_weight: Weight of classification loss for bounding boxes. Defaults to 0.5. For optimal results, multiply with accumulate_grad_batches. @type regr_kpts_loss_weight: float @param regr_kpts_loss_weight: Weight of regression loss for keypoints. Defaults to 12.0. For optimal results, multiply with accumulate_grad_batches. @type vis_kpts_loss_weight: float @param vis_kpts_loss_weight: Weight of visibility loss for keypoints. Defaults to 1.0. For optimal results, multiply with accumulate_grad_batches. @type iou_loss_weight: float @param iou_loss_weight: Weight of IoU loss. Defaults to 7.5. For optimal results, multiply with accumulate_grad_batches. @type sigmas: list[float] | None @param sigmas: Sigmas used in keypoint loss for the OKS metric. If C{None}, COCO sigmas are used if possible, otherwise defaults. Defaults to C{None}. @type area_factor: float | None @param area_factor: Factor by which we multiply the bounding box area used in the keypoint loss. If not set, the default factor of C{0.53} is used.
variable
variable
variable
variable
variable
method
method
dist2kpts_noscale(self, anchor_points: Tensor, kpts: Tensor) -> Tensor: Tensor
Adjusts and scales predicted keypoints relative to anchor points without considering image stride.
class
luxonis_train.attached_modules.losses.EmbeddingLossWrapper(luxonis_train.attached_modules.losses.BaseLoss)
variable
variable
variable
method
variable
method
property
class
luxonis_train.attached_modules.losses.FOMOLocalizationLoss(luxonis_train.attached_modules.losses.BaseLoss)
variable
variable
method
__init__(self, object_weight: float = 500, alpha: float = 0.45, gamma: float = 2, kwargs)
FOMO Localization Loss for object detection using heatmaps. @type object_weight: float @param object_weight: Weight multiplier for keypoint pixels in loss calculation. Typical values range from 100 to 1000 depending on keypoint sparsity. @type alpha: float @param alpha: Focal loss alpha parameter for class balance (0-1 range). Lower values reduce positive example weighting. @type gamma: float @param gamma: Focal loss gamma parameter for hard example focusing (γ >= 0). Higher values focus more on hard misclassified examples.
variable
variable
variable
variable
method
class
luxonis_train.attached_modules.losses.OHEMBCEWithLogitsLoss(luxonis_train.attached_modules.losses.OHEMLoss)
method
class
luxonis_train.attached_modules.losses.OHEMCrossEntropyLoss(luxonis_train.attached_modules.losses.OHEMLoss)
method
class
luxonis_train.attached_modules.losses.OHEMLoss(luxonis_train.attached_modules.losses.BaseLoss)
variable
method
__init__(self, criterion: type[BaseLoss], ohem_ratio: float = 0.1, ohem_threshold: float = 0.7, kwargs)
Initializes the criterion. @type criterion: type[BaseLoss] @param criterion: The criterion to use. @type ohem_ratio: float @param ohem_ratio: The ratio of pixels to keep. @type ohem_threshold: float @param ohem_threshold: The threshold for pixels to keep. @param kwargs: Additional keyword arguments that are passed to the criterion.
variable
variable
variable
method
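The OHEM idea in a few lines (an illustrative sketch, not the library's exact implementation): compute the criterion without reduction, then average only over the hardest examples, keeping at least a fixed fraction of them.

    import torch

    per_pixel_loss = torch.rand(10_000)  # unreduced losses from the criterion
    ohem_ratio, ohem_threshold = 0.1, 0.7  # defaults documented above
    n_keep = int(per_pixel_loss.numel() * ohem_ratio)
    sorted_loss = per_pixel_loss.sort(descending=True).values
    # Keep everything above the threshold, but never fewer than n_keep examples.
    n_hard = int((sorted_loss > ohem_threshold).sum())
    loss = sorted_loss[: max(n_hard, n_keep)].mean()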
class
luxonis_train.attached_modules.losses.PrecisionDFLDetectionLoss(luxonis_train.attached_modules.losses.BaseLoss)
variable
variable
method
__init__(self, tal_topk: int = 10, class_loss_weight: float = 0.5, bbox_loss_weight: float = 7.5, dfl_loss_weight: float = 1.5, kwargs)
BBox loss adapted from U{Real-Time Flying Object Detection with YOLOv8 <https://arxiv.org/pdf/2305.09972>} and from U{YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications <https://arxiv.org/pdf/2209.02976.pdf>}. Code is adapted from U{https://github.com/Nioolek/PPYOLOE_pytorch/blob/master/ppyoloe/models}. @type tal_topk: int @param tal_topk: Number of anchors considered in selection. Defaults to 10. @type class_loss_weight: float @param class_loss_weight: Weight for classification loss. Defaults to 0.5. For optimal results, multiply with accumulate_grad_batches. @type bbox_loss_weight: float @param bbox_loss_weight: Weight for bbox loss. Defaults to 7.5. For optimal results, multiply with accumulate_grad_batches. @type dfl_loss_weight: float @param dfl_loss_weight: Weight for DFL loss. Defaults to 1.5. For optimal results, multiply with accumulate_grad_batches.
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
method
decode_bbox(self, anchor_points: Tensor, pred_dist: Tensor) -> Tensor: Tensor
Decode predicted object bounding box coordinates from anchor points and distribution. @type anchor_points: Tensor @param anchor_points: Anchor points tensor of shape [N, 4] where N is the number of anchors. @type pred_dist: Tensor @param pred_dist: Predicted distribution tensor of shape [batch_size, N, 4 * reg_max] where N is the number of anchors. @rtype: Tensor
variable
variable
variable
variable
class
luxonis_train.attached_modules.losses.PrecisionDFLSegmentationLoss(luxonis_train.attached_modules.losses.PrecisionDFLDetectionLoss)
variable
variable
method
__init__(self, tal_topk: int = 10, class_loss_weight: float = 0.5, bbox_loss_weight: float = 7.5, dfl_loss_weight: float = 1.5, kwargs)
Instance Segmentation and BBox loss adapted from U{Real-Time Flying Object Detection with YOLOv8 <https://arxiv.org/pdf/2305.09972>} and from U{YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications <https://arxiv.org/pdf/2209.02976.pdf>}. Code is adapted from U{https://github.com/Nioolek/PPYOLOE_pytorch/blob/master/ppyoloe/models}. @type tal_topk: int @param tal_topk: Number of anchors considered in selection. Defaults to 10. @type class_loss_weight: float @param class_loss_weight: Weight for classification loss. Defaults to 0.5. For optimal results, multiply with accumulate_grad_batches. @type bbox_loss_weight: float @param bbox_loss_weight: Weight for bbox loss. Defaults to 7.5. For optimal results, multiply with accumulate_grad_batches. @type dfl_loss_weight: float @param dfl_loss_weight: Weight for DFL loss. Defaults to 1.5. For optimal results, multiply with accumulate_grad_batches.
method
method
compute_segmentation_loss(self, fg_mask: Tensor, gt_masks: Tensor, gt_idx: Tensor, bboxes: Tensor, batch_ids: Tensor, proto: Tensor, pred_masks: Tensor) -> Tensor: Tensor
Compute the segmentation loss for the entire batch. @type fg_mask: Tensor @param fg_mask: Foreground mask. Shape: (B, N_anchor). @type gt_masks: Tensor @param gt_masks: Ground truth masks. Shape: (n, H, W). @type gt_idx: Tensor @param gt_idx: Ground truth mask indices. Shape: (B, N_anchor). @type bboxes: Tensor @param bboxes: Ground truth bounding boxes in xyxy format. Shape: (B, N_anchor, 4). @type batch_ids: Tensor @param batch_ids: Batch indices. Shape: (n, 1). @type proto: Tensor @param proto: Prototype masks. Shape: (B, 32, H, W). @type pred_masks: Tensor @param pred_masks: Predicted mask coefficients. Shape: (B, N_anchor, 32).
class
luxonis_train.attached_modules.losses.ReconstructionSegmentationLoss(luxonis_train.attached_modules.losses.BaseLoss)
variable
variable
method
__init__(self, alpha: float = 1, gamma: float = 2.0, reduction: Literal['none', 'mean', 'sum'] = 'mean', smooth: float = 1e-05, kwargs)
ReconstructionSegmentationLoss implements a combined loss function for reconstruction and segmentation tasks. It combines L2 loss for reconstruction, SSIM loss, and Focal loss for segmentation. @type alpha: float @param alpha: Weighting factor for the rare class in the focal loss. Defaults to C{1}. @type gamma: float @param gamma: Focusing parameter for the focal loss. Defaults to C{2.0}. @type smooth: float @param smooth: Label smoothing factor for the focal loss. Defaults to C{1e-05}. @type reduction: Literal["none", "mean", "sum"] @param reduction: Reduction type for the focal loss. Defaults to C{"mean"}.
variable
variable
variable
method
class
luxonis_train.attached_modules.losses.SigmoidFocalLoss(luxonis_train.attached_modules.losses.BaseLoss)
variable
method
__init__(self, alpha: float = 0.25, gamma: float = 2.0, reduction: Literal['none', 'mean', 'sum'] = 'mean', kwargs)
Focal loss from U{Focal Loss for Dense Object Detection <https://arxiv.org/abs/1708.02002>}. @type alpha: float @param alpha: Weighting factor in range (0,1) to balance positive vs negative examples or -1 for ignore. Defaults to C{0.25}. @type gamma: float @param gamma: Exponent of the modulating factor (1 - p_t) to balance easy vs hard examples. Defaults to C{2.0}. @type reduction: Literal["none", "mean", "sum"] @param reduction: Reduction type for loss. Defaults to C{"mean"}.
variable
variable
variable
method
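The same formula is available as a standalone op in torchvision; a sketch, independent of this wrapper:

    import torch
    from torchvision.ops import sigmoid_focal_loss

    logits = torch.randn(8, 1)
    targets = torch.randint(0, 2, (8, 1)).float()
    # Easy examples are down-weighted by (1 - p_t)**gamma, per the linked paper.
    loss = sigmoid_focal_loss(logits, targets, alpha=0.25, gamma=2.0, reduction="mean")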
class
luxonis_train.attached_modules.losses.SmoothBCEWithLogitsLoss(luxonis_train.attached_modules.losses.BaseLoss)
variable
method
__init__(self, label_smoothing: float = 0.0, bce_pow: float = 1.0, weight: list[float] | None = None, reduction: Literal['mean', 'sum', 'none'] = 'mean', kwargs)
BCE with logits loss and label smoothing. @type label_smoothing: float @param label_smoothing: Label smoothing factor. Defaults to C{0.0}. @type bce_pow: float @param bce_pow: Weight for positive samples. Defaults to C{1.0}. @type weight: list[float] | None @param weight: a manual rescaling weight given to the loss of each batch element. If given, it has to be a list of length C{nbatch}. @type reduction: Literal["mean", "sum", "none"] @param reduction: Specifies the reduction to apply to the output: C{'none'} | C{'mean'} | C{'sum'}. C{'none'}: no reduction will be applied, C{'mean'}: the sum of the output will be divided by the number of elements in the output, C{'sum'}: the output will be summed. Note: C{size_average} and C{reduce} are in the process of being deprecated, and in the meantime, specifying either of those two args will override C{reduction}. Defaults to C{'mean'}.
variable
variable
variable
method
forward(self, predictions: Tensor, target: Tensor) -> Tensor: Tensor
Computes the BCE loss with label smoothing. @type predictions: Tensor @param predictions: Network predictions of shape (N, C, ...) @type target: Tensor @param target: A tensor of the same shape as predictions. @rtype: Tensor @return: A scalar tensor.
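A sketch of the common YOLO-style smoothing convention (assumed here): positive targets move to 1 - 0.5 * label_smoothing and negatives to 0.5 * label_smoothing before the regular BCE-with-logits loss is applied.

    import torch

    label_smoothing = 0.1
    pos = 1.0 - 0.5 * label_smoothing  # -> 0.95
    neg = 0.5 * label_smoothing        # -> 0.05
    target = torch.tensor([1.0, 0.0, 1.0])
    smoothed = target * pos + (1.0 - target) * neg  # -> [0.95, 0.05, 0.95]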
class
luxonis_train.attached_modules.losses.SoftmaxFocalLoss(luxonis_train.attached_modules.losses.BaseLoss)
variable
method
__init__(self, alpha: float | list[float] = 0.25, gamma: float = 2.0, smooth: float = 0.0, reduction: Literal['none', 'mean', 'sum'] = 'mean', kwargs)
Focal loss implementation for classification and segmentation tasks using Softmax. @type alpha: float | list[float] @param alpha: Weighting factor for the rare class. Defaults to C{0.25}. @type gamma: float @param gamma: Focusing parameter. Defaults to C{2.0}. @type smooth: float @param smooth: Label smoothing factor. Defaults to C{0.0}. @type reduction: Literal["none", "mean", "sum"] @param reduction: Reduction type. Defaults to C{"mean"}.
variable
variable
variable
variable
method
package
luxonis_train.attached_modules.metrics
module
package
module
module
package
module
module
module
module
module
class
BaseMetric
A base class for all metrics. This class defines the basic interface for all metrics. It utilizes automatic registration of defined subclasses to a METRICS registry.
class
MetricState
Marks an attribute that should be registered as a metric state. Intended to be used as a type hint for class attributes using the `Annotated` type. Upon initialization of a metric, all attributes of the metric that are marked as metric states will be registered using the `add_state` method. The state will be accessible as an attribute of the metric instance. Metric state variables are either a Tensor or an empty list, which the metric can append to. Metric states behave like buffers and parameters of nn.Module, as they are also updated when .to() is called. Unlike parameters and buffers, metric states are not by default saved in the module's nn.Module.state_dict. The metric state variables are automatically reset to their default values when the metric's reset() method is called. Example usage:

    class MyMetric(BaseMetric):
        true_positives: Annotated[Tensor, MetricState(default=0)]
        false_positives: Annotated[Tensor, MetricState(default=0)]
        total: Annotated[Tensor, MetricState(default=0)]
class
DiceCoefficient
Dice coefficient metric for SEGMENTATION tasks.
class
class
class
MIoU
Mean IoU metric for SEGMENTATION tasks.
class
class
OCRAccuracy
Accuracy metric for OCR tasks.
class
class
class
class
class
function
fix_empty_tensor(tensor: Tensor) -> Tensor: Tensor
Empty tensors can cause problems in DDP mode; this method corrects them.
function
merge_bbox_kpt_targets(target_boundingbox: Tensor, target_keypoints: Tensor, device: torch.device | None = None) -> Tensor: Tensor
Merges the bounding box and keypoint targets into a single tensor. @param target_boundingbox: The bounding box targets. @param target_keypoints: The keypoint targets. @param device: The device to use.
package
luxonis_train.attached_modules.metrics.confusion_matrix
module
module
module
module
module
module
class
ConfusionMatrix
Factory class for Confusion Matrix metrics. Creates the appropriate Confusion Matrix based on the task of the node.
class
class
class
class
RecognitionConfusionMatrix
Factory class for Recognition Confusion Matrix metrics. Creates the appropriate confusion matrix metric based on the number of classes of the node.
module
luxonis_train.attached_modules.metrics.confusion_matrix.utils
function
preprocess_instance_masks(predicted_boundingbox: list[Tensor], predicted_instance_segmentation: list[Tensor], target_boundingbox: Tensor, target_instance_segmentation: Tensor, n_classes: int, height: int, width: int, device: torch.device) -> tuple[Tensor, Tensor]: tuple[Tensor, Tensor]
Turns an instance segmentation mask into a semantic one by merging the masks of the same class.
function
compute_mcc(cm: Tensor) -> Tensor: Tensor
" Compute the Matthews correlation coefficient from a confusion matrix. @type cm: Tensor @param cm: Confusion matrix. @rtype: Tensor @return: Matthews correlation coefficient.
class
luxonis_train.attached_modules.metrics.confusion_matrix.ConfusionMatrix
class
luxonis_train.attached_modules.metrics.confusion_matrix.DetectionConfusionMatrix(luxonis_train.attached_modules.metrics.BaseMetric)
class
luxonis_train.attached_modules.metrics.confusion_matrix.FomoConfusionMatrix(luxonis_train.attached_modules.metrics.confusion_matrix.DetectionConfusionMatrix)
variable
method
method
update(self, keypoints: list[Tensor], target_boundingbox: Tensor)
Override update to convert FOMO keypoints into bounding boxes before calling the parent update method.
class
luxonis_train.attached_modules.metrics.confusion_matrix.InstanceSegmentationConfusionMatrix(luxonis_train.attached_modules.metrics.confusion_matrix.DetectionConfusionMatrix, luxonis_train.attached_modules.metrics.confusion_matrix.RecognitionConfusionMatrix)
class
luxonis_train.attached_modules.metrics.confusion_matrix.RecognitionConfusionMatrix(luxonis_train.attached_modules.metrics.BaseMetric)
variable
method
variable
method
method
method
module
luxonis_train.attached_modules.metrics.embedding_metrics
package
luxonis_train.attached_modules.metrics.mean_average_precision
module
module
module
module
module
class
MeanAveragePrecision
Factory class for Mean Average Precision (mAP) metrics. Creates the appropriate mAP metric based on the task of the node.
class
class
MeanAveragePrecisionKeypoints
Mean Average Precision metric for keypoints. Uses OKS as IoU measure.
class
module
luxonis_train.attached_modules.metrics.mean_average_precision.utils
function
function
function
function
class
luxonis_train.attached_modules.metrics.mean_average_precision.MeanAveragePrecision
class
luxonis_train.attached_modules.metrics.mean_average_precision.MeanAveragePrecisionBBox(luxonis_train.attached_modules.metrics.mean_average_precision.MeanAveragePrecision, luxonis_train.attached_modules.metrics.BaseMetric)
class
luxonis_train.attached_modules.metrics.mean_average_precision.MeanAveragePrecisionKeypoints(luxonis_train.attached_modules.metrics.BaseMetric)
variable
variable
variable
variable
variable
variable
variable
variable
method
__init__(self, sigmas: list[float] | None = None, area_factor: float | None = None, max_dets: int = 20, box_format: Literal['xyxy', 'xywh', 'cxcywh'] = 'xyxy', kwargs)
Implementation of the mean average precision metric for keypoint detections. Adapted from: U{https://github.com/Lightning-AI/torchmetrics/blob/v1.0.1/src/torchmetrics/detection/mean_ap.py}. @license: Apache License, Version 2.0 @type sigmas: list[float] | None @param sigmas: Sigma for each keypoint to weigh its importance; if C{None}, COCO sigmas are used if possible, otherwise defaults. Defaults to C{None}. @type area_factor: float | None @param area_factor: Factor by which we multiply the bounding box area. If not set, the default factor of C{0.53} is used. @type max_dets: int @param max_dets: Maximum number of detections to be considered per image. Defaults to C{20}. @type box_format: Literal["xyxy", "xywh", "cxcywh"] @param box_format: Input bounding box format. Defaults to C{"xyxy"}.
variable
variable
variable
variable
method
method
compute(self) -> tuple[Tensor, dict[str, Tensor]]: tuple[Tensor, dict[str, Tensor]]
Torchmetric compute function.
variable
class
luxonis_train.attached_modules.metrics.mean_average_precision.MeanAveragePrecisionSegmentation(luxonis_train.attached_modules.metrics.mean_average_precision.MeanAveragePrecision, luxonis_train.attached_modules.metrics.BaseMetric)
module
luxonis_train.attached_modules.metrics.torchmetrics
class
class
luxonis_train.attached_modules.metrics.torchmetrics.TorchMetricWrapper(luxonis_train.attached_modules.metrics.base_metric.BaseMetric)
variable
method
variable
method
method
method
property
class
luxonis_train.attached_modules.metrics.BaseMetric(luxonis_train.attached_modules.BaseAttachedModule, torchmetrics.Metric)
method
method
update(self, args: Tensor | list[Tensor])
Updates the inner state of the metric. @type args: Unpack[Ts] @param args: Prepared inputs from the L{prepare} method.
method
compute(self) -> Tensor|tuple[Tensor, dict[str, Tensor]]|dict[str, Tensor]: Tensor|tuple[Tensor, dict[str, Tensor]]|dict[str, Tensor]
Computes the metric. @rtype: Tensor | tuple[Tensor, dict[str, Tensor]] | dict[str, Tensor] @return: The computed metric. Can be one of: - A single Tensor. - A tuple of a Tensor and a dictionary of sub-metrics. - A dictionary of sub-metrics. If this is the case, then the metric cannot be used as the main metric of the model.
method
run_update(self, inputs: Packet[Tensor], labels: Labels)
Calls the metric's update method. Validates and prepares the inputs, then calls the metric's update method. @type inputs: Packet[Tensor] @param inputs: The outputs of the model. @type labels: Labels @param labels: The labels of the model. @raises L{IncompatibleError}: If the inputs are not compatible with the module.
class
luxonis_train.attached_modules.metrics.MetricState
class
luxonis_train.attached_modules.metrics.DiceCoefficient(luxonis_train.attached_modules.metrics.BaseMetric)
variable
method
__init__(self, num_classes: int, include_background: bool = True, average: Literal['micro', 'macro', 'weighted', 'none'] = 'micro', input_format: Literal['one-hot', 'index'] = 'index', kwargs)
Initializes the Dice coefficient metric. @type num_classes: int @param num_classes: Number of classes. @type include_background: bool @param include_background: Whether to include the background class. @type average: Literal["micro", "macro", "weighted", "none"] @param average: Type of averaging. @type input_format: Literal["one-hot", "index"] @param input_format: Format of the input.
variable
variable
method
method
method
method
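The quantity being tracked, sketched for a binary mask pair: Dice = 2 * |intersection| / (|prediction| + |target|).

    import torch

    pred = torch.tensor([1, 1, 0, 0])
    target = torch.tensor([1, 0, 0, 0])
    intersection = (pred * target).sum()
    dice = 2 * intersection / (pred.sum() + target.sum())  # -> 2/3 ≈ 0.667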
class
luxonis_train.attached_modules.metrics.ClosestIsPositiveAccuracy(luxonis_train.attached_modules.metrics.BaseMetric)
variable
variable
variable
variable
variable
method
variable
method
method
class
luxonis_train.attached_modules.metrics.MedianDistances(luxonis_train.attached_modules.metrics.BaseMetric)
variable
variable
variable
variable
variable
variable
variable
method
variable
method
method
class
luxonis_train.attached_modules.metrics.MIoU(luxonis_train.attached_modules.metrics.BaseMetric)
variable
method
__init__(self, num_classes: int, include_background: bool = True, per_class: bool = False, input_format: Literal['one-hot', 'index'] = 'index', kwargs)
Initializes the mean IoU metric. @type num_classes: int @param num_classes: Number of classes. @type include_background: bool @param include_background: Whether to include the background class. @type per_class: bool @param per_class: Whether to compute the IoU per class. @type input_format: Literal["one-hot", "index"] @param input_format: Format of the input.
variable
variable
method
method
method
method
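What the metric computes, sketched by hand for index-format masks: per-class IoU is intersection over union of the class masks, and mean IoU averages it across classes.

    import torch

    pred = torch.tensor([0, 1, 1, 2])
    target = torch.tensor([0, 1, 2, 2])
    ious = []
    for c in range(3):
        inter = ((pred == c) & (target == c)).sum()
        union = ((pred == c) | (target == c)).sum()
        ious.append(inter / union)
    miou = torch.stack(ious).mean()  # (1.0 + 0.5 + 0.5) / 3 ≈ 0.667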
class
luxonis_train.attached_modules.metrics.ObjectKeypointSimilarity(luxonis_train.attached_modules.metrics.BaseMetric)
variable
variable
variable
variable
method
__init__(self, sigmas: list[float] | None = None, area_factor: float | None = None, use_cocoeval_oks: bool = True, kwargs)
Object Keypoint Similarity metric for evaluating keypoint predictions. @type sigmas: list[float] | None @param sigmas: Sigma for each keypoint to weigh its importance; if C{None}, COCO sigmas are used if possible, otherwise defaults. Defaults to C{None}. @type area_factor: float | None @param area_factor: Factor by which we multiply the bounding box area. If not set, the default factor of C{0.53} is used. @type use_cocoeval_oks: bool @param use_cocoeval_oks: Whether to use the same OKS formula as in COCOeval or the one from the definition. Defaults to C{True}.
variable
variable
variable
method
method
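The per-keypoint similarity in the COCOeval formulation (selected by use_cocoeval_oks=True) is exp(-d^2 / (2 * s * k^2)), with d the prediction-to-GT distance, s the object scale derived from the box area, and k = 2 * sigma. An illustrative sketch:

    import math

    d = 5.0           # pixel distance between predicted and GT keypoint
    s = 100.0 * 0.53  # box area scaled by the default area_factor above
    k = 2 * 0.079     # k = 2 * sigma (0.079 is a typical COCO sigma)
    oks_i = math.exp(-(d ** 2) / (2 * s * k ** 2))
    # Per-instance OKS averages oks_i over the labeled (visible) keypoints.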
class
luxonis_train.attached_modules.metrics.OCRAccuracy(luxonis_train.attached_modules.metrics.BaseMetric)
variable
variable
variable
variable
variable
variable
method
__init__(self, blank_class: int = 0, kwargs)
Initializes the OCR accuracy metric. @type blank_class: int @param blank_class: Index of the blank class. Defaults to C{0}.
variable
method
update(self, predictions: Tensor, target: Tensor)
Updates the running metric with the given predictions and targets. @type predictions: Tensor @param predictions: A tensor containing the network predictions. @type target: Tensor @param target: A tensor containing the target labels.
method
compute(self) -> tuple[Tensor, dict[str, Tensor]]: tuple[Tensor, dict[str, Tensor]]
Computes the OCR accuracy. @rtype: tuple[Tensor, dict[str, Tensor]] @return: A tuple containing the OCR accuracy and a dictionary of individual accuracies.
class
luxonis_train.attached_modules.metrics.Accuracy(luxonis_train.attached_modules.metrics.torchmetrics.TorchMetricWrapper)
variable
class
luxonis_train.attached_modules.metrics.F1Score(luxonis_train.attached_modules.metrics.torchmetrics.TorchMetricWrapper)
variable
class
luxonis_train.attached_modules.metrics.JaccardIndex(luxonis_train.attached_modules.metrics.torchmetrics.TorchMetricWrapper)
variable
class
luxonis_train.attached_modules.metrics.Precision(luxonis_train.attached_modules.metrics.torchmetrics.TorchMetricWrapper)
variable
class
luxonis_train.attached_modules.metrics.Recall(luxonis_train.attached_modules.metrics.torchmetrics.TorchMetricWrapper)
variable
package
luxonis_train.attached_modules.visualizers
module
module
module
module
module
module
module
module
module
module
class
BaseVisualizer
A base class for all visualizers. This class defines the basic interface for all visualizers. It utilizes automatic registration of defined subclasses to the VISUALIZERS registry.
class
class
class
class
class
InstanceSegmentationVisualizer
Visualizer for instance segmentation tasks, supporting the visualization of predicted and ground truth bounding boxes and instance segmentation masks.
class
class
OCRVisualizer
Visualizer for OCR tasks.
class
function
combine_visualizations(visualization: Tensor | tuple[Tensor, Tensor] | tuple[Tensor, list[Tensor]]) -> Tensor: Tensor
Default way of combining multiple visualizations into one final image.
function
denormalize(img: Tensor, mean: list[float] | float | None = None, std: list[float] | float | None = None, to_uint8: bool = False) -> Tensor: Tensor
Denormalizes an image back to original values, optionally converts it to uint8. @type img: Tensor @param img: Image to denormalize. @type mean: list[float] | float | None @param mean: Mean used for denormalization. Defaults to C{None}. @type std: list[float] | float | None @param std: Std used for denormalization. Defaults to C{None}. @type to_uint8: bool @param to_uint8: Whether to convert to uint8. Defaults to C{False}. @rtype: Tensor @return: denormalized image.
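For example (the import path follows the package listing above):

    import torch
    from luxonis_train.attached_modules.visualizers import denormalize

    img = torch.randn(3, 64, 64)  # normalized CHW image
    # Undo ImageNet-style normalization and convert to uint8 for drawing.
    vis = denormalize(
        img,
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
        to_uint8=True,
    )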
function
draw_bounding_box_labels(img: Tensor, label: Tensor, kwargs) -> Tensor: Tensor
Draws bounding box labels on an image. @type img: Tensor @param img: Image to draw on. @type label: Tensor @param label: Bounding box label. The shape should be (n_instances, 4), where the last dimension is (x, y, w, h). @type kwargs: dict @param kwargs: Additional arguments to pass to L{torchvision.utils.draw_bounding_boxes}. @rtype: Tensor @return: Image with bounding box labels drawn on.
function
draw_keypoint_labels(img: Tensor, label: Tensor, kwargs) -> Tensor: Tensor
Draws keypoint labels on an image. @type img: Tensor @param img: Image to draw on. @type label: Tensor @param label: Keypoint label. The shape should be (n_instances, 3), where the last dimension is (x, y, visibility). @type kwargs: dict @param kwargs: Additional arguments to pass to L{torchvision.utils.draw_keypoints}. @rtype: Tensor @return: Image with keypoint labels drawn on.
function
draw_segmentation_targets(image: Tensor, target: Tensor, alpha: float = 0.4, colors: Color | list[Color] | None = None) -> Tensor: Tensor
Draws segmentation labels on an image. @type image: Tensor @param image: Image to draw on. @type target: Tensor @param target: Segmentation label. @type alpha: float @param alpha: Alpha value for blending. Defaults to C{0.4}. @rtype: Tensor @return: Image with segmentation labels drawn on.
function
get_color(seed: int) -> Color: Color
Generates a random color from a seed. @type seed: int @param seed: Seed to use for the generator. @rtype: L{Color} @return: Generated color.
function
function
preprocess_images(imgs: Tensor, mean: list[float] | float | None = None, std: list[float] | float | None = None) -> Tensor: Tensor
Performs preprocessing on a batch of images. Preprocessing includes denormalizing and converting to uint8. @type imgs: Tensor @param imgs: Batch of images. @type mean: list[float] | float | None @param mean: Mean used for denormalization. Defaults to C{None}. @type std: list[float] | float | None @param std: Std used for denormalization. Defaults to C{None}. @rtype: Tensor @return: Batch of preprocessed images.
function
seg_output_to_bool(data: Tensor, binary_threshold: float = 0.5) -> Tensor: Tensor
Converts seg head output to 2D boolean mask for visualization.
module
luxonis_train.attached_modules.visualizers.base_visualizer
type variable
module
luxonis_train.attached_modules.visualizers.segmentation_visualizer
variable
module
luxonis_train.attached_modules.visualizers.utils
variable
function
figure_to_torch(fig: Figure, width: int, height: int) -> Tensor: Tensor
Converts a matplotlib `Figure` to a `Tensor`.
function
torch_img_to_numpy(img: Tensor, reverse_colors: bool = False) -> npt.NDArray[np.uint8]: npt.NDArray[np.uint8]
Converts a torch image (CHW) to a numpy array (HWC). Optionally also converts colors. @type img: Tensor @param img: Torch image (CHW) @type reverse_colors: bool @param reverse_colors: Whether to reverse colors (RGB to BGR). Defaults to False. @rtype: npt.NDArray[np.uint8] @return: Numpy image (HWC)
function
numpy_to_torch_img(img: np.ndarray) -> Tensor: Tensor
Converts numpy image (HWC) to torch image (CHW).
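The two helpers are inverses of each other, which makes a quick round trip a handy sanity check (import path per the module listing above):

    import torch
    from luxonis_train.attached_modules.visualizers.utils import (
        numpy_to_torch_img,
        torch_img_to_numpy,
    )

    img = (torch.rand(3, 8, 8) * 255).to(torch.uint8)  # CHW torch image
    arr = torch_img_to_numpy(img)   # HWC numpy array
    back = numpy_to_torch_img(arr)  # CHW torch image again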
function
number_to_hsl(seed: int) -> tuple[float, float, float]: tuple[float, float, float]
Map a number to a distinct HSL color.
function
hsl_to_rgb(hsl: tuple[float, float, float]) -> Color: Color
Convert HSL color to RGB.
class
luxonis_train.attached_modules.visualizers.BaseVisualizer(luxonis_train.attached_modules.BaseAttachedModule)
method
forward(self, target_canvas: Tensor, prediction_canvas: Tensor, args: Unpack[Ts]) -> Tensor|tuple[Tensor, Tensor]|tuple[Tensor, list[Tensor]]|list[Tensor]: Tensor|tuple[Tensor, Tensor]|tuple[Tensor, list[Tensor]]|list[Tensor]
Forward pass of the visualizer. Takes an image and the prepared inputs from the `prepare` method and produces visualizations. Visualizations can be either: - A single image (I{e.g.} for classification, weight visualization). - A tuple of two images, representing (labels, predictions) (I{e.g.} for bounding boxes, keypoints). - A tuple of an image and a list of images, representing (labels, multiple visualizations) (I{e.g.} for segmentation, depth estimation). - A list of images, representing unrelated visualizations. @type target_canvas: Tensor @param target_canvas: An image to draw the labels on. @type prediction_canvas: Tensor @param prediction_canvas: An image to draw the predictions on. @type args: Unpack[Ts] @param args: Prepared inputs from the `prepare` method. @rtype: Tensor | tuple[Tensor, Tensor] | tuple[Tensor, list[Tensor]] | list[Tensor] @return: Visualizations. @raise IncompatibleError: If the inputs are not compatible with the module.
method
class
luxonis_train.attached_modules.visualizers.BBoxVisualizer(luxonis_train.attached_modules.visualizers.BaseVisualizer)
variable
method
__init__(self, labels: dict[int, str] | list[str] | None = None, draw_labels: bool = True, colors: dict[str, Color] | list[Color] | None = None, fill: bool = False, width: int | None = None, font: str | None = None, font_size: int | None = None, kwargs)
Visualizer for bounding box predictions. Creates a visualization of the bounding box predictions and labels. @type labels: dict[int, str] | list[str] | None @param labels: Either a dictionary mapping class indices to names, or a list of names. If a list is provided, the label mapping is done by index. By default, no labels are drawn. @type draw_labels: bool @param draw_labels: Whether or not to draw labels. Defaults to C{True}. @type colors: dict[str, Color] | list[Color] | None @param colors: Either a dictionary mapping class names to colors, or a list of colors. If a list is provided, the color mapping is done by index. By default, random colors are used. @type fill: bool @param fill: Whether or not to fill the bounding boxes. Defaults to C{False}. @type width: int | None @param width: The width of the bounding box lines. Defaults to C{1}. @type font: str | None @param font: A filename containing a TrueType font. Defaults to C{None}. @type font_size: int | None @param font_size: The font size to use for the labels. Defaults to C{None}.
variable
variable
variable
variable
variable
variable
variable
method
method
method
forward(self, prediction_canvas: Tensor, target_canvas: Tensor, predictions: list[Tensor], targets: Tensor | None) -> tuple[Tensor, Tensor]|Tensor: tuple[Tensor, Tensor]|Tensor
Creates a visualization of the bounding box predictions and labels. @type target_canvas: Tensor @param target_canvas: The canvas containing the labels. @type prediction_canvas: Tensor @param prediction_canvas: The canvas containing the predictions. @type predictions: list[Tensor] @param predictions: The predicted bounding boxes. Each tensor should have shape [N, 6], where N is the number of bounding boxes and the last dimension is [x1, y1, x2, y2, class, conf]. @type targets: Tensor | None @param targets: The target bounding boxes.
class
luxonis_train.attached_modules.visualizers.ClassificationVisualizer(luxonis_train.attached_modules.visualizers.BaseVisualizer)
variable
method
__init__(self, include_plot: bool = True, font_scale: float = 1.0, color: tuple[int, int, int] = (255, 0, 0), thickness: int = 2, multilabel: bool = False, kwargs)
Visualizer for classification tasks. @type include_plot: bool @param include_plot: Whether to include a plot of the class probabilities in the visualization. Defaults to C{True}.
variable
variable
variable
variable
variable
method
class
luxonis_train.attached_modules.visualizers.EmbeddingsVisualizer(luxonis_train.attached_modules.visualizers.BaseVisualizer)
variable
method
__init__(self, z_score_threshold: float = 3, kwargs)
Visualizer for embedding tasks like reID. @type z_score_threshold: float @param z_score_threshold: The threshold for filtering out outliers.
variable
variable
method
forward(self, prediction_canvas: Tensor, target_canvas: Tensor, predictions: Tensor, target: Tensor) -> tuple[Tensor, Tensor]: tuple[Tensor, Tensor]
Creates a visualization of the embeddings. @type target_canvas: Tensor @param target_canvas: The canvas to draw the labels on. @type prediction_canvas: Tensor @param prediction_canvas: The canvas to draw the predictions on. @type predictions: Tensor @param predictions: The embeddings to visualize. @type target: Tensor @param target: IDs of the embeddings. @rtype: tuple[Tensor, Tensor] @return: An embedding space projection.
static method
method
method
class
luxonis_train.attached_modules.visualizers.FOMOVisualizer(luxonis_train.attached_modules.visualizers.BBoxVisualizer)
class
luxonis_train.attached_modules.visualizers.InstanceSegmentationVisualizer(luxonis_train.attached_modules.visualizers.BaseVisualizer)
variable
method
__init__(self, labels: dict[int, str] | list[str] | None = None, draw_labels: bool = True, colors: dict[str, Color] | list[Color] | None = None, fill: bool = False, width: int | None = None, font: str | None = None, font_size: int | None = None, alpha: float = 0.6, kwargs)
Visualizer for instance segmentation tasks. @type labels: dict[int, str] | list[str] | None @param labels: Dictionary mapping class indices to class labels. @type draw_labels: bool @param draw_labels: Whether to draw class labels on the visualizations. @type colors: dict[str, L{Color}] | list[L{Color}] | None @param colors: Dictionary mapping class labels to colors. @type fill: bool | None @param fill: Whether to fill the bounding box with color. @type width: int | None @param width: Width of the bounding box lines. @type font: str | None @param font: Font of the class labels. @type font_size: int | None @param font_size: Font size of the class labels. @type alpha: float @param alpha: Alpha value of the segmentation masks. Defaults to C{0.6}.
variable
variable
variable
variable
variable
variable
variable
variable
method
static method
method
forward(self, prediction_canvas: Tensor, target_canvas: Tensor, boundingbox: list[Tensor], instance_segmentation: list[Tensor], target_boundingbox: Tensor | None, target_instance_segmentation: Tensor | None) -> tuple[Tensor, Tensor]|Tensor: tuple[Tensor, Tensor]|Tensor
Creates visualizations of the predicted and target bounding boxes and instance masks. @type target_canvas: Tensor @param target_canvas: Tensor containing the target visualizations. @type prediction_canvas: Tensor @param prediction_canvas: Tensor containing the predicted visualizations. @type boundingbox: list[Tensor] @param boundingbox: List of tensors containing the predicted bounding boxes. @type instance_segmentation: list[Tensor] @param instance_segmentation: List of tensors containing the predicted instance masks. @type target_boundingbox: Tensor | None @param target_boundingbox: Tensor containing the target bounding boxes. @type target_instance_segmentation: Tensor | None @param target_instance_segmentation: Tensor containing the target instance masks.
class
luxonis_train.attached_modules.visualizers.KeypointVisualizer(luxonis_train.attached_modules.visualizers.BBoxVisualizer)
variable
method
__init__(self, visibility_threshold: float = 0.5, connectivity: list[tuple[int, int]] | None = None, visible_color: Color = 'red', nonvisible_color: Color | None = None, kwargs)
Visualizer for keypoints. @type visibility_threshold: float @param visibility_threshold: Threshold for visibility of keypoints. If the visibility of a keypoint is below this threshold, it is considered as not visible. Defaults to C{0.5}. @type connectivity: list[tuple[int, int]] | None @param connectivity: List of tuples of keypoint indices that define the connections in the skeleton. Defaults to C{None}. @type visible_color: L{Color} @param visible_color: Color of visible keypoints. Either a string or a tuple of RGB values. Defaults to C{"red"}. @type nonvisible_color: L{Color} | None @param nonvisible_color: Color of nonvisible keypoints. If C{None}, nonvisible keypoints are not drawn. Defaults to C{None}.
variable
variable
variable
variable
static method
static method
method
class
luxonis_train.attached_modules.visualizers.OCRVisualizer(luxonis_train.attached_modules.visualizers.BaseVisualizer)
variable
method
__init__(self, font_scale: float = 0.5, color: tuple[int, int, int] = (0, 0, 0), thickness: int = 1, kwargs)
Initializes the OCR visualizer. @type font_scale: float @param font_scale: Font scale of the text. Defaults to C{0.5}. @type color: tuple[int, int, int] @param color: Color of the text. Defaults to C{(0, 0, 0)}. @type thickness: int @param thickness: Thickness of the text. Defaults to C{1}.
variable
variable
variable
method
forward(self, prediction_canvas: Tensor, target_canvas: Tensor, predictions: Tensor, targets: Tensor) -> tuple[Tensor, Tensor]: tuple[Tensor, Tensor]
Creates a visualization of the OCR predictions and labels. @type target_canvas: Tensor @param target_canvas: The canvas to draw the labels on. @type prediction_canvas: Tensor @param prediction_canvas: The canvas to draw the predictions on. @type predictions: Tensor @param predictions: The predictions to visualize. @type targets: Tensor @param targets: The targets to visualize. @rtype: tuple[Tensor, Tensor] @return: A tuple of the label and prediction visualizations.
class
luxonis_train.attached_modules.visualizers.SegmentationVisualizer(luxonis_train.attached_modules.visualizers.BaseVisualizer)
variable
method
__init__(self, colors: Color | list[Color] | None = None, background_class: int | None = 0, background_color: Color = '#000000', alpha: float = 0.6, kwargs)
Visualizer for segmentation tasks. @type colors: L{Color} | list[L{Color}] @param colors: Color of the segmentation masks. Defaults to C{"#5050FF"}. @type background_class: int | None @param background_class: Index of the background class. Defaults to C{0}. If set, the background class will be drawn with the `background_color`. @type background_color: L{Color} | None @param background_color: Color of the background class. Defaults to C{"#000000"}. @type alpha: float @param alpha: Alpha value of the segmentation masks. Defaults to C{0.6}.
variable
variable
variable
variable
variable
static method
static method
method
forward(self, prediction_canvas: Tensor, target_canvas: Tensor, predictions: Tensor, target: Tensor
|
None) -> tuple[Tensor, Tensor]|Tensor: tuple[Tensor, Tensor]|Tensor
Creates a visualization of the segmentation predictions and targets. @type prediction_canvas: Tensor @param prediction_canvas: The canvas to draw the predictions on. @type target_canvas: Tensor @param target_canvas: The canvas to draw the targets on. @type predictions: Tensor @param predictions: The predictions to visualize. @type target: Tensor | None @param target: The targets to visualize. @rtype: tuple[Tensor, Tensor] | Tensor @return: A tuple of the target and prediction visualizations, or only the prediction visualization if C{target} is C{None}.
property
package
luxonis_train.callbacks
module
module
module
module
gpu_stats_monitor
GPU Stats Monitor. Monitors and logs GPU stats during training.
module
module
module
module
module
module
module
class
class
EMACallback
Callback that updates the stored parameters using a moving average.
class
class
class
GradCamCallback
Callback to visualize gradients using Grad-CAM (experimental). Works only during validation.
class
class
LuxonisRichProgressBar
Custom rich text progress bar based on RichProgressBar from Pytorch Lightning.
class
LuxonisTQDMProgressBar
Custom text progress bar based on TQDMProgressBar from Pytorch Lightning.
class
class
TestOnTrainEnd
Callback to perform a test run at the end of the training.
class
class
UploadCheckpoint
Callback that uploads the best checkpoint based on the validation loss.
module
luxonis_train.callbacks.ema
class
ModelEma
Model Exponential Moving Average. Keeps a moving average of everything in the model.state_dict (parameters and buffers).
class
luxonis_train.callbacks.ema.ModelEma(torch.nn.Module)
method
__init__(self, model: pl.LightningModule, decay: float = 0.9999, use_dynamic_decay: bool = True, decay_tau: float = 2000)
Constructs `ModelEma`. @type model: L{pl.LightningModule} @param model: Pytorch Lightning module. @type decay: float @param decay: Decay rate for the moving average. @type use_dynamic_decay: bool @param use_dynamic_decay: Use dynamic decay rate. @type decay_tau: float @param decay_tau: Decay tau for the moving average.
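A sketch of the update rule this implies, assuming the standard EMA formulation from the timm implementation credited below (not the verbatim code; the warm-up form of the dynamic decay is an assumption):

    import math
    import torch

    @torch.no_grad()
    def ema_step(ema_state: dict, model_state: dict, *, decay: float = 0.9999,
                 use_dynamic_decay: bool = True, updates: int = 0,
                 decay_tau: float = 2000) -> None:
        if use_dynamic_decay:
            # Warm-up: the effective decay ramps from 0 towards `decay`.
            decay = decay * (1 - math.exp(-updates / decay_tau))
        for key, value in model_state.items():
            if value.dtype.is_floating_point:
                # ema <- decay * ema + (1 - decay) * current
                ema_state[key].mul_(decay).add_(value, alpha=1 - decay)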
variable
variable
variable
variable
variable
method
update(self, model: pl.LightningModule)
Update the stored parameters using a moving average. Source: U{<https://github.com/huggingface/pytorch-image-models/blob/main/timm/utils/model_ema.py>} @license: U{Apache License 2.0<https://github.com/huggingface/pytorch-image-models/tree/main?tab=Apache-2.0-1-ov-file#readme>} @type model: L{pl.LightningModule} @param model: Pytorch Lightning module.
module
luxonis_train.callbacks.gradcam_visualizer
class
class
luxonis_train.callbacks.gradcam_visualizer.PLModuleWrapper(lightning.pytorch.LightningModule)
method
__init__(self, pl_module: lxt.LuxonisLightningModule, task: str)
Constructs `PLModuleWrapper`. @type pl_module: LuxonisLightningModule @param pl_module: The model to be wrapped. @type task: str @param task: The type of task (e.g., segmentation, detection, classification, keypoint_detection).
variable
variable
method
forward(self, inputs: Tensor, args, kwargs) -> Tensor: Tensor
Forward pass through the model, returning the output based on the task type. @type inputs: Tensor @param inputs: Input tensor for the model. @type args: Any @param args: Additional positional arguments. @type kwargs: Any @param kwargs: Additional keyword arguments. @rtype: Tensor @return: The processed output based on the task type.
module
luxonis_train.callbacks.needs_checkpoint
class
class
luxonis_train.callbacks.needs_checkpoint.NeedsCheckpoint(lightning.pytorch.Callback)
class
luxonis_train.callbacks.ArchiveOnTrainEnd(luxonis_train.callbacks.needs_checkpoint.NeedsCheckpoint)
method
on_train_end(self, _: pl.Trainer, pl_module: lxt.LuxonisLightningModule)
Archives the model on train end. @type trainer: L{pl.Trainer} @param trainer: Pytorch Lightning trainer. @type pl_module: L{pl.LightningModule} @param pl_module: Pytorch Lightning module.
class
luxonis_train.callbacks.EMACallback(lightning.pytorch.Callback)
method
__init__(self, decay: float = 0.5, use_dynamic_decay: bool = True, decay_tau: float = 2000)
Constructs `EMACallback`. @type decay: float @param decay: Decay rate for the moving average. @type use_dynamic_decay: bool @param use_dynamic_decay: Use dynamic decay rate. If True, the decay rate will be updated based on the number of updates. @type decay_tau: float @param decay_tau: Decay tau for the moving average.
variable
variable
variable
variable
variable
property
method
on_fit_start(self, trainer: pl.Trainer, pl_module: pl.LightningModule)
Initialize `ModelEma` to keep a copy of the moving average of the weights. @type trainer: L{pl.Trainer} @param trainer: Pytorch Lightning trainer. @type pl_module: L{pl.LightningModule} @param pl_module: Pytorch Lightning module.
method
on_train_batch_end(self, trainer: pl.Trainer, pl_module: pl.LightningModule, outputs: STEP_OUTPUT, batch: Any, batch_idx: int)
Update the stored parameters using a moving average. @type trainer: L{pl.Trainer} @param trainer: Pytorch Lightning trainer. @type pl_module: L{pl.LightningModule} @param pl_module: Pytorch Lightning module. @type outputs: Any @param outputs: Outputs from the training step. @type batch: Any @param batch: Batch data. @type batch_idx: int @param batch_idx: Batch index.
method
on_validation_epoch_start(self, trainer: pl.Trainer, pl_module: pl.LightningModule)
Swap the model's weights to the EMA weights at the start of validation. @type trainer: L{pl.Trainer} @param trainer: Pytorch Lightning trainer. @type pl_module: L{pl.LightningModule} @param pl_module: Pytorch Lightning module.
method
on_validation_end(self, trainer: pl.Trainer, pl_module: pl.LightningModule)
Restore the original model weights after validation. @type trainer: L{pl.Trainer} @param trainer: Pytorch Lightning trainer. @type pl_module: L{pl.LightningModule} @param pl_module: Pytorch Lightning module.
method
on_test_epoch_start(self, trainer: pl.Trainer, pl_module: pl.LightningModule)
Swap the model's weights to the EMA weights at the start of testing. @type trainer: L{pl.Trainer} @param trainer: Pytorch Lightning trainer. @type pl_module: L{pl.LightningModule} @param pl_module: Pytorch Lightning module.
method
on_test_end(self, trainer: pl.Trainer, pl_module: pl.LightningModule)
Restore the original model weights after testing. @type trainer: L{pl.Trainer} @param trainer: Pytorch Lightning trainer. @type pl_module: L{pl.LightningModule} @param pl_module: Pytorch Lightning module.
method
on_train_end(self, trainer: pl.Trainer, pl_module: pl.LightningModule)
Replace the model's weights with the EMA weights at the end of training. This final update ensures that the trained model uses the EMA weights. @type trainer: L{pl.Trainer} @param trainer: Pytorch Lightning trainer. @type pl_module: L{pl.LightningModule} @param pl_module: Pytorch Lightning module.
method
on_save_checkpoint(self, trainer: pl.Trainer, pl_module: pl.LightningModule, checkpoint: dict)
Save the EMA state dictionary into the checkpoint. @type trainer: L{pl.Trainer} @param trainer: Pytorch Lightning trainer. @type pl_module: L{pl.LightningModule} @param pl_module: Pytorch Lightning module. @type checkpoint: dict @param checkpoint: Pytorch Lightning checkpoint.
method
on_load_checkpoint(self, trainer: pl.Trainer, pl_module: pl.LightningModule, callback_state: dict)
Load the EMA state dictionary from the checkpoint. @type callback_state: dict @param callback_state: Pytorch Lightning callback state.
class
luxonis_train.callbacks.ExportOnTrainEnd(luxonis_train.callbacks.needs_checkpoint.NeedsCheckpoint)
method
on_train_end(self, _: pl.Trainer, pl_module: lxt.LuxonisLightningModule)
Exports the model on train end. @type trainer: L{pl.Trainer} @param trainer: Pytorch Lightning trainer. @type pl_module: L{pl.LightningModule} @param pl_module: Pytorch Lightning module.
class
luxonis_train.callbacks.GPUStatsMonitor(lightning.pytorch.Callback)
method
__init__(self, memory_utilization: bool = True, gpu_utilization: bool = True, intra_step_time: bool = False, inter_step_time: bool = False, fan_speed: bool = False, temperature: bool = False)
Automatically monitors and logs GPU stats during the training stage. C{GPUStatsMonitor} is a callback; in order to use it, you need to assign a logger to the C{Trainer}. GPU stats are mainly based on the C{nvidia-smi --query-gpu} command. The description of the queries is as follows: - C{fan.speed} - The fan speed value is the percent of maximum speed that the device's fan is currently intended to run at. It ranges from 0 to 100 %. Note: The reported speed is the intended fan speed. If the fan is physically blocked and unable to spin, this output will not match the actual fan speed. Many parts do not report fan speeds because they rely on cooling via fans in the surrounding enclosure. - C{memory.used} - Total memory allocated by active contexts. - C{memory.free} - Total free memory. - C{utilization.gpu} - Percent of time over the past sample period during which one or more kernels were executing on the GPU. The sample period may be between 1 second and 1/6 second depending on the product. - C{utilization.memory} - Percent of time over the past sample period during which global (device) memory was being read or written. The sample period may be between 1 second and 1/6 second depending on the product. - C{temperature.gpu} - Core GPU temperature, in degrees C. - C{temperature.memory} - HBM memory temperature, in degrees C. @type memory_utilization: bool @param memory_utilization: Set to C{True} to monitor used, free and percentage of memory utilization at the start and end of each step. Defaults to C{True}. @type gpu_utilization: bool @param gpu_utilization: Set to C{True} to monitor percentage of GPU utilization at the start and end of each step. Defaults to C{True}. @type intra_step_time: bool @param intra_step_time: Set to C{True} to monitor the time of each step. Defaults to C{False}. @type inter_step_time: bool @param inter_step_time: Set to C{True} to monitor the time between the end of one step and the start of the next step. Defaults to C{False}. @type fan_speed: bool @param fan_speed: Set to C{True} to monitor percentage of fan speed. Defaults to C{False}. @type temperature: bool @param temperature: Set to C{True} to monitor the memory and GPU temperature in degrees Celsius. Defaults to C{False}. @raises MisconfigurationException: If the NVIDIA driver is not installed, not running on GPUs, or C{Trainer} has no logger.
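A minimal sketch of attaching the callback; a logger is required, as noted above (the CSVLogger here is just a placeholder):

    import lightning.pytorch as pl
    from lightning.pytorch.loggers import CSVLogger
    from luxonis_train.callbacks import GPUStatsMonitor

    trainer = pl.Trainer(
        logger=CSVLogger("logs"),  # GPUStatsMonitor raises without a logger
        callbacks=[GPUStatsMonitor(temperature=True, intra_step_time=True)],
    )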
static method
method
method
method
method
class
luxonis_train.callbacks.GradCamCallback(lightning.pytorch.Callback)
method
__init__(self, target_layer: int, class_idx: int = 0, log_n_batches: int = 1, task: str = 'classification')
Constructs `GradCamCallback`. @type target_layer: int @param target_layer: Layer to visualize gradients for. @type class_idx: int @param class_idx: Index of the class used for the visualization. Defaults to C{0}. @type log_n_batches: int @param log_n_batches: Number of batches to log. Defaults to C{1}. @type task: str @param task: The type of task. Defaults to C{"classification"}.
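For illustration, a hedged construction sketch (the target layer index C{5} is arbitrary and depends on the model architecture):

    from luxonis_train.callbacks import GradCamCallback

    gradcam = GradCamCallback(
        target_layer=5,         # index of the layer whose gradients are visualized
        class_idx=0,            # class to explain
        log_n_batches=1,        # only the first validation batch is logged
        task="classification",
    )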
variable
variable
variable
variable
method
setup(self, trainer: pl.Trainer, pl_module: lxt.LuxonisLightningModule, stage: str)
Initializes the model wrapper. @type trainer: pl.Trainer @param trainer: The PyTorch Lightning trainer. @type pl_module: LuxonisLightningModule @param pl_module: The LuxonisLightningModule. @type stage: str @param stage: The stage of the training loop.
variable
method
on_validation_batch_end(self, trainer: pl.Trainer, pl_module: lxt.LuxonisLightningModule, outputs: STEP_OUTPUT, batch: tuple
[
dict
[
str
,
Tensor
]
,
Packet
[
Tensor
]
], batch_idx: int)
At the end of first n batches, visualize the gradients using Grad-CAM. @type trainer: pl.Trainer @param trainer: The PyTorch Lightning trainer. @type pl_module: LuxonisLightningModule @param pl_module: The PyTorch Lightning module. @type outputs: STEP_OUTPUT @param outputs: The output of the model. @type batch: Any @param batch: The input batch. @type batch_idx: int @param batch_idx: The index of the batch.
method
visualize_gradients(self, trainer: pl.Trainer, pl_module: lxt.LuxonisLightningModule, images: Tensor, batch_idx: int)
Visualizes the gradients using Grad-CAM. @type trainer: pl.Trainer @param trainer: The PyTorch Lightning trainer. @type pl_module: pl.LightningModule @param pl_module: The PyTorch Lightning module. @type images: Tensor @param images: The input images. @type batch_idx: int @param batch_idx: The index of the batch.
variable
class
luxonis_train.callbacks.BaseLuxonisProgressBar(abc.ABC, lightning.pytorch.callbacks.ProgressBar)
method
method
print_results(self, stage: str, loss: float, metrics: Mapping
[
str
,
Mapping
[
str
,
(
int
|
str
|
float
)
]
])
Prints results to the console. This includes the stage name, loss value, and tables with metrics. @type stage: str @param stage: Stage name. @type loss: float @param loss: Loss value. @type metrics: Mapping[str, Mapping[str, int | str | float]] @param metrics: Metrics in format {table_name: table}.
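A sketch of the expected metrics structure, one table per name (values are illustrative; `bar` stands for any instantiated progress bar from this module):

    bar.print_results(
        stage="val",
        loss=0.42,
        metrics={"detection": {"mAP": 0.31, "mAP@50": 0.55}},
    )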
class
luxonis_train.callbacks.LuxonisRichProgressBar(lightning.pytorch.callbacks.RichProgressBar, luxonis_train.callbacks.BaseLuxonisProgressBar)
class
luxonis_train.callbacks.LuxonisTQDMProgressBar(lightning.pytorch.callbacks.TQDMProgressBar, luxonis_train.callbacks.BaseLuxonisProgressBar)
class
luxonis_train.callbacks.MetadataLogger(lightning.pytorch.Callback)
method
__init__(self, hyperparams: list
[
str
])
Callback that logs training metadata. The metadata includes all defined hyperparameters together with the git hashes of the luxonis-ml and luxonis-train packages; it is also stored locally. @type hyperparams: list[str] @param hyperparams: List of hyperparameters to log.
variable
method
class
luxonis_train.callbacks.TestOnTrainEnd(lightning.pytorch.Callback)
class
luxonis_train.callbacks.TrainingManager(lightning.pytorch.callbacks.BaseFinetuning)
method
method
method
on_after_backward(self, trainer: pl.Trainer, pl_module: lxt.LuxonisLightningModule)
PyTorch Lightning hook that is called after the backward pass. @type trainer: pl.Trainer @param trainer: The trainer object. @type pl_module: pl.LightningModule @param pl_module: The pl_module object.
class
luxonis_train.callbacks.UploadCheckpoint(lightning.pytorch.Callback)
method
__init__(self)
Constructs `UploadCheckpoint`.
variable
method
package
luxonis_train.config
module
package
class
class
class
class
class
class
class
module
luxonis_train.config.config
class
class
class
class
class
class
class
class
class
class
class
class
class
class
class
class
luxonis_train.config.config.ImageSize(typing.NamedTuple)
class
luxonis_train.config.config.FreezingConfig(luxonis_ml.utils.BaseModelExtraForbid)
class
luxonis_train.config.config.PredefinedModelConfig(luxonis_ml.typing.ConfigItem)
variable
variable
variable
variable
class
luxonis_train.config.config.ModelConfig(luxonis_ml.utils.BaseModelExtraForbid)
variable
variable
variable
variable
variable
variable
variable
variable
CLASS_METHOD
method
method
method
method
method
CLASS_METHOD
class
luxonis_train.config.config.TrackerConfig(luxonis_ml.utils.BaseModelExtraForbid)
variable
variable
variable
variable
variable
variable
variable
variable
variable
class
luxonis_train.config.config.LoaderConfig(luxonis_ml.typing.ConfigItem)
variable
variable
variable
variable
variable
CLASS_METHOD
method
class
luxonis_train.config.config.NormalizeAugmentationConfig(luxonis_ml.utils.BaseModelExtraForbid)
class
luxonis_train.config.config.AugmentationConfig(luxonis_ml.typing.ConfigItem)
variable
class
luxonis_train.config.config.PreprocessingConfig(luxonis_ml.utils.BaseModelExtraForbid)
variable
variable
variable
variable
variable
CLASS_METHOD
method
method
get_active_augmentations(self) -> list[ConfigItem]: list[ConfigItem]
Returns the list of active augmentations. @rtype: list[ConfigItem] @return: Filtered list of active augmentation configurations.
class
luxonis_train.config.config.CallbackConfig(luxonis_ml.typing.ConfigItem)
variable
class
luxonis_train.config.config.OnnxExportConfig(luxonis_ml.utils.BaseModelExtraForbid)
class
luxonis_train.config.config.BlobconverterExportConfig(luxonis_ml.utils.BaseModelExtraForbid)
variable
variable
variable
class
luxonis_train.config.config.ArchiveConfig(luxonis_ml.utils.BaseModelExtraForbid)
class
luxonis_train.config.config.StorageConfig(luxonis_ml.utils.BaseModelExtraForbid)
class
luxonis_train.config.config.TunerConfig(luxonis_ml.utils.BaseModelExtraForbid)
variable
variable
variable
variable
variable
variable
variable
variable
package
luxonis_train.config.predefined_models
module
module
module
module
module
module
module
module
module
class
class
class
class
class
class
class
class
OCRRecognitionModel
A predefined model for OCR recognition tasks.
class
module
luxonis_train.config.predefined_models.anomaly_detection_model
type alias
class
function
get_variant(variant: VariantLiteral) -> AnomalyVariant: AnomalyVariant
Returns the specific variant configuration for the AnomalyDetectionModel.
class
luxonis_train.config.predefined_models.anomaly_detection_model.AnomalyVariant(pydantic.BaseModel)
module
luxonis_train.config.predefined_models.classification_model
type alias
class
function
get_variant(variant: VariantLiteral) -> ClassificationVariant: ClassificationVariant
Returns the specific variant configuration for the ClassificationModel.
class
luxonis_train.config.predefined_models.classification_model.ClassificationVariant(pydantic.BaseModel)
module
luxonis_train.config.predefined_models.detection_fomo_model
type alias
class
function
get_variant(variant: VariantLiteral) -> FOMOVariant: FOMOVariant
Returns the specific variant configuration for the FOMOModel.
class
luxonis_train.config.predefined_models.detection_fomo_model.FOMOVariant(pydantic.BaseModel)
module
luxonis_train.config.predefined_models.detection_model
type alias
class
function
get_variant(variant: VariantLiteral) -> DetectionVariant: DetectionVariant
Returns the specific variant configuration for the DetectionModel.
class
luxonis_train.config.predefined_models.detection_model.DetectionVariant(pydantic.BaseModel)
variable
variable
variable
variable
module
luxonis_train.config.predefined_models.instance_segmentation_model
type alias
class
function
get_variant(variant: VariantLiteral) -> InstanceSegmentationVariant: InstanceSegmentationVariant
Returns the specific variant configuration for the InstanceSegmentationModel.
class
luxonis_train.config.predefined_models.instance_segmentation_model.InstanceSegmentationVariant(pydantic.BaseModel)
module
luxonis_train.config.predefined_models.keypoint_detection_model
type alias
class
function
get_variant(variant: VariantLiteral) -> KeypointDetectionVariant: KeypointDetectionVariant
Returns the specific variant configuration for the KeypointDetectionModel.
class
luxonis_train.config.predefined_models.keypoint_detection_model.KeypointDetectionVariant(pydantic.BaseModel)
module
luxonis_train.config.predefined_models.ocr_recognition_model
type alias
type alias
class
function
get_variant(variant: VariantLiteral) -> OCRRecognitionVariant: OCRRecognitionVariant
Returns the specific variant configuration for the OCRRecognitionModel.
class
luxonis_train.config.predefined_models.ocr_recognition_model.OCRRecognitionVariant(pydantic.BaseModel)
variable
variable
variable
variable
module
luxonis_train.config.predefined_models.segmentation_model
type alias
class
function
get_variant(variant: VariantLiteral) -> SegmentationVariant: SegmentationVariant
Returns the specific variant configuration for the SegmentationModel.
class
luxonis_train.config.predefined_models.segmentation_model.SegmentationVariant(pydantic.BaseModel)
class
luxonis_train.config.predefined_models.AnomalyDetectionModel(luxonis_train.config.predefined_models.BasePredefinedModel)
method
variable
variable
variable
variable
variable
variable
property
nodes
Defines the model nodes, including RecSubNet and DiscSubNetHead.
property
losses
Defines the loss module for the anomaly detection task.
property
metrics
Defines the metrics used for evaluation.
property
visualizers
Defines the visualizer used for the anomaly detection task.
class
luxonis_train.config.predefined_models.BasePredefinedModel(abc.ABC)
property
property
property
property
method
class
luxonis_train.config.predefined_models.ClassificationModel(luxonis_train.config.predefined_models.BasePredefinedModel)
method
variable
variable
variable
variable
variable
variable
variable
variable
variable
property
nodes
Defines the model nodes, including backbone and head.
property
losses
Defines the loss module for the classification task.
property
metrics
Defines the metrics used for evaluation.
property
visualizers
Defines the visualizer used for the classification task.
class
luxonis_train.config.predefined_models.FOMOModel(luxonis_train.config.predefined_models.BasePredefinedModel)
method
variable
variable
variable
variable
variable
variable
property
property
property
property
class
luxonis_train.config.predefined_models.DetectionModel(luxonis_train.config.predefined_models.BasePredefinedModel)
method
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
property
nodes
Defines the model nodes, including backbone, neck, and head.
property
losses
Defines the loss module for the detection task.
property
metrics
Defines the metrics used for evaluation.
property
visualizers
Defines the visualizer used for the detection task.
class
luxonis_train.config.predefined_models.InstanceSegmentationModel(luxonis_train.config.predefined_models.BasePredefinedModel)
method
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
property
nodes
Defines the model nodes, including backbone, neck, and head.
property
losses
Defines the loss module for the instance segmentation task.
property
metrics
Defines the metrics used for evaluation.
property
visualizers
Defines the visualizer used for the instance segmentation task.
class
luxonis_train.config.predefined_models.KeypointDetectionModel(luxonis_train.config.predefined_models.BasePredefinedModel)
method
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
property
nodes
Defines the model nodes, including backbone, neck, and head.
property
losses
Defines the loss module for the keypoint detection task.
property
metrics
Defines the metrics used for evaluation.
property
visualizers
Defines the visualizer used for the keypoint detection task.
class
luxonis_train.config.predefined_models.OCRRecognitionModel(luxonis_train.config.predefined_models.BasePredefinedModel)
method
variable
variable
variable
variable
variable
variable
variable
property
nodes
Defines the model nodes, including backbone and head.
property
losses
Defines the loss module for the classification task.
property
metrics
Defines the metrics used for evaluation.
property
visualizers
Defines the visualizer used for the detection task.
class
luxonis_train.config.predefined_models.SegmentationModel(luxonis_train.config.predefined_models.BasePredefinedModel)
method
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
property
nodes
Defines the model nodes, including backbone and head.
property
losses
Defines the loss module for the segmentation task.
property
metrics
Defines the metrics used for evaluation.
property
visualizers
Defines the visualizer used for the segmentation task.
class
luxonis_train.config.AttachedModuleConfig(luxonis_ml.typing.ConfigItem)
class
luxonis_train.config.Config(luxonis_ml.utils.LuxonisConfig)
variable
variable
variable
variable
variable
variable
variable
constant
CLASS_METHOD
CLASS_METHOD
CLASS_METHOD
smart_auto_populate
Automatically populates config fields based on rules, with warnings.
class
luxonis_train.config.ExportConfig(luxonis_train.config.config.ArchiveConfig)
variable
variable
variable
variable
variable
variable
variable
variable
variable
CLASS_METHOD
class
luxonis_train.config.LossModuleConfig(luxonis_train.config.AttachedModuleConfig)
class
luxonis_train.config.MetricModuleConfig(luxonis_train.config.AttachedModuleConfig)
variable
class
luxonis_train.config.NodeConfig(luxonis_ml.typing.ConfigItem)
variable
variable
variable
variable
variable
variable
variable
class
luxonis_train.config.TrainerConfig(luxonis_ml.utils.BaseModelExtraForbid)
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
method
method
method
method
reorder_callbacks(self) -> Self: Self
Reorder callbacks so that EMA is the first callback, since it needs to be updated before other callbacks.
package
luxonis_train.core
module
package
class
LuxonisModel
Common logic of the core components (trainer, evaluator, exporter, etc.).
module
luxonis_train.core.core
type alias
package
luxonis_train.core.utils
module
luxonis_train.core.utils.archive_utils
class
function
get_inputs(path: Path) -> dict[str, ArchiveMetadataDict]: dict[str, ArchiveMetadataDict]
Get inputs of a model executable. @type path: Path @param path: Path to model executable file.
function
get_outputs(path: Path) -> dict[str, ArchiveMetadataDict]: dict[str, ArchiveMetadataDict]
Get outputs of a model executable. @type path: Path @param path: Path to model executable file.
function
get_head_configs(lightning_module: LuxonisLightningModule, outputs: list
[
dict
]) -> list[dict]: list[dict]
Get model heads. @type lightning_module: LuxonisLightningModule @param lightning_module: Lightning module. @type outputs: list[dict] @param outputs: List of NN Archive outputs. @rtype: list[dict] @return: List of head configurations.
class
luxonis_train.core.utils.archive_utils.ArchiveMetadataDict(typing.TypedDict)
module
luxonis_train.core.utils.export_utils
function
function
function
function
module
luxonis_train.core.utils.infer_utils
constant
constant
function
process_visualizations(visualizations: dict
[
str
,
dict
[
str
,
Tensor
]
], batch_size: int) -> dict[tuple[str, str], list[np.ndarray]]: dict[tuple[str, str], list[np.ndarray]]
Processes batched visualization tensors into per-image numpy arrays for rendering or saving.
function
prepare_and_infer_image(model: lxt.LuxonisModel, img: Tensor) -> LuxonisOutput: LuxonisOutput
Prepares the image for inference and runs the model.
function
function
infer_from_video(model: lxt.LuxonisModel, video_path: PathType, save_dir: Path
|
None)
Runs inference on individual frames from a video. @type model: L{LuxonisModel} @param model: The model to use for inference. @type video_path: PathType @param video_path: The path to the video. @type save_dir: Path | None @param save_dir: The directory to save the visualizations to.
function
infer_from_loader(model: lxt.LuxonisModel, loader: torch_data.DataLoader, save_dir: PathType
|
None, img_paths: list
[
PathType
]
|
None = None)
Runs inference on images from a data loader. @type model: L{LuxonisModel} @param model: The model to use for inference. @type loader: torch_data.DataLoader @param loader: The loader to use for inference. @type save_dir: PathType | None @param save_dir: The directory to save the visualizations to. @type img_paths: list[PathType] | None @param img_paths: The paths to the images.
function
infer_from_directory(model: lxt.LuxonisModel, img_paths: Iterable
[
PathType
], save_dir: Path
|
None)
Runs inference on individual images from a directory. @type model: L{LuxonisModel} @param model: The model to use for inference. @type img_paths: Iterable[Path] @param img_paths: Iterable of paths to the images. @type save_dir: Path | None @param save_dir: The directory to save the visualizations to.
function
infer_from_dataset(model: lxt.LuxonisModel, view: Literal
[
'
train
'
,
'
val
'
,
'
test
'
], save_dir: PathType
|
None)
Runs inference on images from the dataset. @type model: L{LuxonisModel} @param model: The model to use for inference. @type view: Literal["train", "val", "test"] @param view: The view of the dataset to use. @type save_dir: PathType | None @param save_dir: The directory to save the visualizations to.
module
luxonis_train.core.utils.train_utils
function
create_trainer(cfg: TrainerConfig, kwargs: Any) -> pl.Trainer: pl.Trainer
Creates a Pytorch Lightning trainer. @type cfg: TrainerConfig @param cfg: Trainer configuration object. @param kwargs: Additional arguments to pass to the trainer. @rtype: pl.Trainer @return: Pytorch Lightning trainer.
module
luxonis_train.core.utils.tune_utils
function
get_trial_params(all_augs: list
[
str
], params: dict
[
str
,
Any
], trial: optuna.trial.Trial) -> dict[str, Any]: dict[str, Any]
Get trial parameters based on specified config.
class
luxonis_train.core.LuxonisModel
method
__init__(self, cfg: str
|
Params
|
Config
|
None, opts: Params
|
list
[
str
]
|
tuple
[
str
,
...
]
|
None = None)
Constructs a new `LuxonisModel` instance. Loads the config and initializes loaders, dataloaders, augmentations, lightning components, etc. @type cfg: str | Params | Config | None @param cfg: Path to the config file or config object used to set up training. @type opts: Params | list[str] | tuple[str, ...] | None @param opts: Argument dict or list provided through the command line, used for config overriding.
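A typical end-to-end flow might look as follows (a sketch; the config path is hypothetical, and the alternating key/value override syntax is assumed to mirror the CLI):

    from luxonis_train.core import LuxonisModel

    model = LuxonisModel(
        "configs/detection.yaml",       # path to a training config
        opts=["trainer.epochs", "10"],  # CLI-style config overrides
    )
    model.train()
    results = model.test(view="test")
    model.export()             # saved under the run's "export" directory by default
    archive = model.archive()  # builds an NN Archive, exporting first if needed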
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
train(self, new_thread: bool = False, weights: PathType
|
None = None)
Runs training. @type new_thread: bool @param new_thread: Runs training in a new thread if set to C{True}. @type weights: PathType | None @param weights: Path to the weights. If the user specifies weights in the config file, the weights provided here take precedence.
method
export(self, save_path: PathType
|
None = None, weights: PathType
|
None = None, ignore_missing_weights: bool = False)
Runs export. @type save_path: PathType | None @param save_path: Directory where to save all exported model files. If not specified, files will be saved to the "export" directory in the run save directory. @type weights: PathType | None @param weights: Path to the checkpoint from which to load weights. If not specified, the value of `model.weights` from the configuration file will be used. The current weights of the model will be temporarily replaced with the weights from the specified checkpoint. @type ignore_missing_weights: bool @param ignore_missing_weights: If set to True, the warning about missing weights will be suppressed. @raises RuntimeError: If C{onnxsim} fails to simplify the model.
method
test(self, new_thread: bool = False, view: Literal
[
'
train
'
,
'
val
'
,
'
test
'
] = 'test', weights: PathType
|
None = None) -> Mapping[str, float]|None: Mapping[str, float]|None
Runs testing. @type new_thread: bool @param new_thread: Runs testing in a new thread if set to C{True}. @type view: Literal["train", "val", "test"] @param view: Which view to run the testing on. Defaults to C{"test"}. @type weights: PathType | None @param weights: Path to the checkpoint from which to load weights. If not specified, the value of `model.weights` from the configuration file will be used. The current weights of the model will be temporarily replaced with the weights from the specified checkpoint. @rtype: Mapping[str, float] | None @return: If C{new_thread} is C{False}, returns a dictionary of test results.
variable
method
infer(self, view: Literal
[
'
train
'
,
'
val
'
,
'
test
'
] = 'val', save_dir: PathType
|
None = None, source_path: PathType
|
None = None, weights: PathType
|
None = None)
Runs inference. @type view: str @param view: Which split to run the inference on. Valid values are: C{"train"}, C{"val"}, C{"test"}. Defaults to C{"val"}. @type save_dir: PathType | None @param save_dir: Directory where to save the visualizations. If not specified, visualizations will be rendered on the screen. @type source_path: PathType | None @param source_path: Path to the image file, video file or directory. If None, defaults to using dataset images. @type weights: PathType | None @param weights: Path to the checkpoint from which to load weights. If not specified, the value of `model.weights` from the configuration file will be used. The current weights of the model will be temporarily replaced with the weights from the specified checkpoint.
method
tune(self)
Runs Optuna tuning of hyperparameters.
variable
method
archive(self, path: PathType
|
None = None, weights: PathType
|
None = None) -> Path: Path
Generates an NN Archive out of a model executable. @type path: PathType | None @param path: Path to the model executable. If not specified, the model will be exported first. @type weights: PathType | None @param weights: Path to the checkpoint from which to load weights. If not specified, the value of `model.weights` from the configuration file will be used. The current weights of the model will be temporarily replaced with the weights from the specified checkpoint. @rtype: Path @return: Path to the generated NN Archive.
method
get_min_loss_checkpoint_path(self) -> str|None: str|None
Return the best checkpoint path with respect to minimal validation loss. @rtype: str | None @return: Path to the best checkpoint with respect to minimal validation loss, or C{None} if no checkpoint is available.
method
get_best_metric_checkpoint_path(self) -> str|None: str|None
Return the best checkpoint path with respect to the best validation metric. @rtype: str | None @return: Path to the best checkpoint with respect to the best validation metric, or C{None} if no checkpoint is available.
method
get_mlflow_logging_keys(self)
Returns a dictionary with two lists of keys: 1) "metrics" -> Keys expected to be logged as standard metrics 2) "artifacts" -> Keys expected to be logged as artifacts (e.g. confusion_matrix.json, visualizations)
package
luxonis_train.lightning
module
module
module
class
LuxonisLightningModule
Class representing the entire model. This class keeps track of the model graph, nodes, and attached modules. The model topology is defined as an acyclic graph of nodes. The graph is saved as a dictionary of predecessors.
class
module
luxonis_train.lightning.utils
type variable
type alias
class
class
function
compute_losses(cfg: Config, losses: dict
[
str
,
dict
[
str
,
(
Tensor
|
tuple
[
Tensor
,
dict
[
str
,
Tensor
]
]
)
]
], loss_weights: dict
[
str
,
float
], device: torch.device) -> tuple[Tensor, dict[str, Tensor]]: tuple[Tensor, dict[str, Tensor]]
Computes the final loss as a weighted sum of all the losses. @type losses: dict[str, dict[str, Tensor | tuple[Tensor, dict[str, Tensor]]]] @param losses: Dictionary of computed losses. Each node can have multiple losses attached. The first key identifies the node, the second key identifies the specific loss. Values are either single tensors or tuples of tensors and sub-losses. @rtype: tuple[Tensor, dict[str, Tensor]] @return: Tuple of final loss and dictionary of all losses for logging. The dictionary is in a format of C{{loss_name: loss_value}}.
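The weighting semantics described above amount to the following simplified sketch (assuming C{loss_weights} is keyed by loss name; sub-loss tuples are omitted for brevity):

    import torch

    losses = {"head": {"bce": torch.tensor(0.7), "iou": torch.tensor(0.3)}}
    loss_weights = {"bce": 1.0, "iou": 2.5}

    final_loss = sum(
        loss_weights[name] * value
        for node_losses in losses.values()
        for name, value in node_losses.items()
    )
    # final_loss == 0.7 * 1.0 + 0.3 * 2.5 == 1.45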
function
function
function
function
function
build_optimizers(cfg: Config, parameters: Iterable
[
nn.Parameter
]) -> tuple[list[Optimizer], list[LRScheduler]]: tuple[list[Optimizer], list[LRScheduler]]
Configures model optimizers and schedulers.
function
build_callbacks(cfg: Config, main_metric: tuple
[
str
,
str
]
|
None, save_dir: Path, nodes: Nodes) -> list[pl.Callback]: list[pl.Callback]
Configures Pytorch Lightning callbacks.
function
function
postprocess_metrics(name: str, values: Any) -> dict[str, Tensor]: dict[str, Tensor]
Convert metric computation result into a dictionary of values.
class
luxonis_train.lightning.utils.LossAccumulator(collections.defaultdict)
method
variable
method
method
class
luxonis_train.lightning.utils.Nodes((dict[str, BaseNode] if TYPE_CHECKING else nn.ModuleDict))
method
variable
variable
variable
variable
variable
property
method
method
method
method
class
luxonis_train.lightning.LuxonisLightningModule(lightning.pytorch.LightningModule)
variable
save_dir
Directory to save checkpoints and logs.
variable
nodes
Nodes of the model. Keys are node names, unique for each node.
variable
graph
Graph of the model in a format of a dictionary of predecessors. Keys are node names, values are inputs to the node (list of node names). Nodes with no inputs are considered inputs of the whole model.
variable
loss_weights
Dictionary of loss weights. Keys are loss names, values are weights.
variable
input_shapes
Dictionary of input shapes. Keys are node names, values are lists of shapes (understood as shapes of the "feature" field in Packet[Tensor]).
variable
outputs
List of output node names.
variable
losses
Nested dictionary of losses used in the model. Each node can have multiple losses attached. The first key identifies the node, the second key identifies the specific loss.
variable
visualizers
Dictionary of visualizers to be used with the model.
variable
metrics
Dictionary of metrics to be used with the model.
variable
dataset_metadata
Metadata of the dataset.
variable
main_metric
Name of the main metric to be used for model checkpointing. If not set, the model with the best metric score won't be saved.
variable
method
__init__(self, cfg: Config, save_dir: PathType, input_shapes: dict
[
str
,
Size
], dataset_metadata: DatasetMetadata
|
None = None, _core: luxonis_train.core.LuxonisModel
|
None = None, kwargs)
Constructs an instance of `LuxonisLightningModule` from `Config`. @type cfg: L{Config} @param cfg: Config object. @type save_dir: str @param save_dir: Directory to save checkpoints. @type input_shapes: dict[str, Size] @param input_shapes: Dictionary of input shapes. Keys are input names, values are shapes. @type dataset_metadata: L{DatasetMetadata} | None @param dataset_metadata: Dataset metadata. @type kwargs: Any @param kwargs: Additional arguments to pass to the L{LightningModule} constructor.
variable
variable
variable
property
property
property
core
Returns the core model.
method
forward(self, inputs: dict
[
str
,
Tensor
], labels: Labels
|
None = None, images: Tensor
|
None = None, compute_loss: bool = True, compute_metrics: bool = False, compute_visualizations: bool = False) -> LuxonisOutput: LuxonisOutput
Forward pass of the model. Traverses the graph and step-by-step computes the outputs of each node. Each next node is computed only when all of its predecessors are computed. Once the outputs are not needed anymore, they are removed from the memory. @type inputs: dict[str, Tensor] @param inputs: Dictionary of input tensors. @type labels: L{Labels} | None @param labels: Labels dictionary. Defaults to C{None}. @type images: L{Tensor} | None @param images: Canvas tensor for visualizers. Defaults to C{None}. @type compute_loss: bool @param compute_loss: Whether to compute losses. Defaults to C{True}. @type compute_metrics: bool @param compute_metrics: Whether to update metrics. Defaults to C{False}. @type compute_visualizations: bool @param compute_visualizations: Whether to compute visualizations. Defaults to C{False}. @rtype: L{LuxonisOutput} @return: Output of the model.
method
export_onnx(self, save_path: str, kwargs) -> list[str]: list[str]
Exports the model to ONNX format. @type save_path: str @param save_path: Path where the exported model will be saved. @type kwargs: Any @param kwargs: Additional arguments for the L{torch.onnx.export} method. @rtype: list[str] @return: List of output names.
method
process_losses(self, losses_dict: dict
[
str
,
dict
[
str
,
(
Tensor
|
tuple
[
Tensor
,
dict
[
str
,
Tensor
]
]
)
]
]) -> tuple[Tensor, dict[str, Tensor]]: tuple[Tensor, dict[str, Tensor]]
Processes individual losses from the model run. Goes over the computed losses and computes the final loss as a weighted sum of all the losses. @type losses_dict: dict[str, dict[str, Tensor | tuple[Tensor, dict[str, Tensor]]]] @param losses_dict: Dictionary of computed losses. Each node can have multiple losses attached. The first key identifies the node, the second key identifies the specific loss. Values are either single tensors or tuples of tensors and sub-losses. @rtype: tuple[Tensor, dict[str, Tensor]] @return: Tuple of final loss and dictionary of processed sub-losses. The dictionary is in a format of C{{loss_name: loss_value}}.
method
method
method
method
method
method
method
method
method
method
method
load_checkpoint(self, path: str
|
Path
|
None)
Loads checkpoint weights from the provided path. Loads the checkpoints gracefully, ignoring keys that are not found in the model state dict or in the checkpoint. @type path: str | Path | None @param path: Path to the checkpoint. If C{None}, no checkpoint will be loaded.
method
get_mlflow_logging_keys(self) -> dict[str, list[str]]: dict[str, list[str]]
Returns a dictionary with two lists of keys: 1) "metrics" -> Keys expected to be logged as standard metrics 2) "artifacts" -> Keys expected to be logged as artifacts (e.g. confusion_matrix.json, visualizations)
class
luxonis_train.lightning.LuxonisOutput
package
luxonis_train.loaders
module
module
module
module
module
class
class
class
type alias
function
collate_fn(batch: list
[
LuxonisLoaderTorchOutput
]) -> tuple[dict[str, Tensor], Labels]: tuple[dict[str, Tensor], Labels]
Default collate function used for training. @type batch: list[LuxonisLoaderTorchOutput] @param batch: List of loader outputs (dict of Tensors) and labels (dict of Tensors) in the LuxonisLoaderTorchOutput format. @rtype: tuple[dict[str, Tensor], Labels] @return: Tuple of inputs and annotations in the format expected by the model.
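A sketch of plugging the collate function into a standard PyTorch DataLoader (the dataset name is hypothetical):

    from torch.utils.data import DataLoader
    from luxonis_train.loaders import LuxonisLoaderTorch, collate_fn

    dataset = LuxonisLoaderTorch(dataset_name="my_dataset", view=["train"])
    loader = DataLoader(dataset, batch_size=4, collate_fn=collate_fn)
    inputs, labels = next(iter(loader))  # dict[str, Tensor], Labels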
module
luxonis_train.loaders.luxonis_perlin_loader_torch
module
luxonis_train.loaders.perlin
function
function
function
function
function
function
function
function
function
apply_anomaly_to_img(img: Tensor, anomaly_img: Tensor, beta: float
|
None = None) -> tuple[Tensor, Tensor]: tuple[Tensor, Tensor]
Applies Perlin noise-based anomalies to a single image (C, H, W). @type img: Tensor @param img: The input image tensor of shape (C, H, W). @type anomaly_img: Tensor @param anomaly_img: The anomaly source image blended into C{img}. @type beta: float | None @param beta: A blending factor for anomaly and noise. If C{None}, a random value in the range [0, 0.8] is used. Defaults to C{None}. @rtype: tuple[Tensor, Tensor] @return: A tuple containing: - augmented_img (Tensor): The augmented image with applied anomaly and Perlin noise. - perlin_mask (Tensor): The Perlin noise mask applied to the image.
class
luxonis_train.loaders.BaseLoaderTorch(torch.utils.data.Dataset, abc.ABC)
method
__init__(self, view: list
[
str
], height: int
|
None = None, width: int
|
None = None, augmentation_engine: str = 'albumentations', augmentation_config: list
[
ConfigItem
]
|
None = None, image_source: str = 'image', keep_aspect_ratio: bool = True, color_space: Literal
[
'
RGB
'
,
'
BGR
'
] = 'RGB')
Base abstract loader class that enforces LuxonisLoaderTorchOutput output label structure. @type view: list[str] @param view: List of splits that form the view. Usually contains only one split, e.g. C{["train"]} or C{["test"]}. However, more complex datasets can make use of multi-split views, e.g. C{["train_synthetic", "train_real"]}. @type height: int @param height: Height of the output image. @type width: int @param width: Width of the output image. @type augmentation_engine: str @param augmentation_engine: Name of the augmentation engine. Can be used to enable swapping between different augmentation engines or making use of pre-defined engines, e.g. C{AlbumentationsEngine}. @type augmentation_config: list[ConfigItem] | None @param augmentation_config: List of augmentation configurations. Individual configurations are in the form of:: class ConfigItem: name: str params: dict[str, JsonValue] Where C{name} is the name of the augmentation and C{params} is a dictionary of its parameters. Example:: ConfigItem( name="HorizontalFlip", params={"p": 0.5}, ) @type image_source: str @param image_source: Name of the image source. Only relevant for datasets with multiple image sources, e.g. C{"left"} and C{"right"}. This parameter defines which of these sources is used for visualizations. @type keep_aspect_ratio: bool @param keep_aspect_ratio: Whether to keep the aspect ratio of the output image after resizing. @type color_space: Literal["RGB", "BGR"] @param color_space: Color space of the output image.
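A minimal subclass sketch of the interface implied here, using synthetic random data (the exact set of required overrides is assumed from the methods and properties listed below, and the label format is illustrative):

    import torch
    from torch import Tensor
    from luxonis_train.loaders import BaseLoaderTorch

    class RandomLoader(BaseLoaderTorch):
        def __len__(self) -> int:
            return 16

        @property
        def input_shapes(self) -> dict[str, torch.Size]:
            return {self.image_source: torch.Size([3, 224, 224])}

        def get(self, idx: int) -> tuple[dict[str, Tensor], dict[str, Tensor]]:
            image = torch.rand(3, 224, 224)
            labels = {"classification": torch.tensor([0.0, 1.0])}
            return {self.image_source: image}, labels

        def get_classes(self) -> dict[str, dict[str, int]]:
            return {"classification": {"cat": 0, "dog": 1}}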
property
image_source
Name of the input image group.
property
view
List of splits forming this dataset's view.
property
augmentation_engine
Name of the augmentation engine.
property
augmentation_config
List of augmentation configurations.
property
height
Height of the output image.
property
width
Width of the output image.
property
keep_aspect_ratio
Whether to keep the aspect ratio of the output image after resizing.
property
color_space
Color space of the output image.
property
input_shapes
Shape (c, h, w) of each loader group (sub-element), WITHOUT batch dimension. Examples: Single image input: { 'image': torch.Size([3, 224, 224]), } Image and segmentation input: { 'image': torch.Size([3, 224, 224]), 'segmentation': torch.Size([1, 224, 224]), } Left image, right image and disparity input: { 'left': torch.Size([3, 224, 224]), 'right': torch.Size([3, 224, 224]), 'disparity': torch.Size([1, 224, 224]), } Image, keypoints, and point cloud input: { 'image': torch.Size([3, 224, 224]), 'keypoints': torch.Size([17, 2]), 'point_cloud': torch.Size([20000, 3]), }
property
input_shape
Shape (c, h, w) of the input tensor, WITHOUT batch dimension.
method
method
method
__len__(self) -> int: int
Returns length of the dataset.
method
get(self, idx: int) -> tuple[(Tensor|dict[str, Tensor]), Labels]: tuple[(Tensor|dict[str, Tensor]), Labels]
Loads sample from dataset. @type idx: int @param idx: Sample index. @rtype: L{LuxonisLoaderTorchOutput} @return: Sample's data in L{LuxonisLoaderTorchOutput} format.
method
get_classes(self) -> dict[str, dict[str, int]]: dict[str, dict[str, int]]
Gets classes according to the computer vision task. @rtype: dict[str, dict[str, int]] @return: A dictionary mapping tasks to their classes as mappings from class names to class IDs.
method
get_n_keypoints(self) -> dict[str, int]|None: dict[str, int]|None
Returns the number of keypoints for each task. @rtype: dict[str, int] | None @return: A dictionary mapping tasks to their number of keypoints.
method
method
dict_numpy_to_torch(self, numpy_dictionary: dict
[
str
,
np.ndarray
]) -> dict[str, Tensor]: dict[str, Tensor]
Converts a dictionary of numpy arrays to a dictionary of torch tensors. @type numpy_dictionary: dict[str, np.ndarray] @param numpy_dictionary: Dictionary of numpy arrays. @rtype: dict[str, Tensor] @return: Dictionary of torch tensors.
method
read_image(self, path: str) -> npt.NDArray[np.uint8]: npt.NDArray[np.uint8]
Reads an image from a file and returns an unnormalized image as a numpy array. @type path: str @param path: Path to the image file. @rtype: np.ndarray[np.uint8] @return: Image as a numpy array.
class
luxonis_train.loaders.LuxonisLoaderTorch(luxonis_train.loaders.BaseLoaderTorch)
method
__init__(self, dataset_name: str
|
None = None, dataset_dir: str
|
None = None, dataset_type: DatasetType
|
None = None, team_id: str
|
None = None, bucket_type: Literal
[
'
internal
'
,
'
external
'
] = 'internal', bucket_storage: Literal
[
'
local
'
,
'
s3
'
,
'
gcs
'
,
'
azure
'
] = 'local', update_mode: Literal
[
'
always
'
,
'
if_empty
'
] = 'always', delete_existing: bool = True, kwargs)
Torch-compatible loader for Luxonis datasets. Can either use an already existing dataset or parse a new one from a directory. @type dataset_name: str | None @param dataset_name: Name of the dataset to load. If not provided, the C{dataset_dir} argument must be provided instead. If both C{dataset_dir} and C{dataset_name} are provided, the dataset will be parsed from the directory and saved with the provided name. @type dataset_dir: str | None @param dataset_dir: Path to the dataset directory. It can be either a local path or a URL. The data can be in a zip file. If not provided, C{dataset_name} of an existing dataset must be provided. @type dataset_type: str | None @param dataset_type: Type of the dataset. Only relevant when C{dataset_dir} is provided. If not provided, the type will be inferred from the directory structure. @type team_id: str | None @param team_id: Optional unique team identifier for the cloud. @type bucket_type: Literal["internal", "external"] @param bucket_type: Type of the bucket. Only relevant for remote datasets. Defaults to 'internal'. @type bucket_storage: Literal["local", "s3", "gcs", "azure"] @param bucket_storage: Type of the bucket storage. Defaults to 'local'. @type delete_existing: bool @param delete_existing: Only relevant when C{dataset_dir} is provided. By default, the dataset is parsed again every time the loader is created because the underlying data might have changed. If C{delete_existing} is set to C{False} and a dataset of the same name already exists, the existing dataset will be used instead of re-parsing the data.
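For example (paths and names hypothetical), parsing a local directory once and reusing the parsed dataset on subsequent runs:

    from luxonis_train.loaders import LuxonisLoaderTorch

    loader = LuxonisLoaderTorch(
        dataset_dir="data/my_dataset",  # parsed into a dataset on first use
        dataset_name="my_dataset",
        view=["train"],
        delete_existing=False,          # reuse the parsed dataset next time
    )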
variable
variable
method
property
method
method
method
method
method
class
luxonis_train.loaders.LuxonisLoaderPerlinNoise(luxonis_train.loaders.LuxonisLoaderTorch)
method
__init__(self, args, anomaly_source_path: str, noise_prob: float = 0.5, beta: float
|
None = None, kwargs)
Custom loader for LDF that adds Perlin noise during training with a given probability. @type anomaly_source_path: str @param anomaly_source_path: Path to the anomaly dataset from which random samples are drawn for noise. @type noise_prob: float @param noise_prob: The probability with which to apply Perlin noise. @type beta: float | None @param beta: The opacity of the anomaly mask. If C{None}, a random value is chosen; setting it to C{None} is recommended.
variable
variable
variable
variable
variable
variable
method
method
package
luxonis_train.nodes
package
luxonis_train.nodes.activations
class
luxonis_train.nodes.activations.HSigmoid(torch.nn.Module)
method
__init__(self)
Hard-Sigmoid (approximated sigmoid) activation function from U{Searching for MobileNetV3<https://arxiv.org/abs/1905.02244>}.
variable
method
package
luxonis_train.nodes.backbones
module
package
module
package
package
package
package
module
package
package
package
package
module
module
class
class
class
class
class
module
luxonis_train.nodes.backbones.contextspatial
class
luxonis_train.nodes.backbones.contextspatial.SpatialPath(torch.nn.Module)
method
variable
variable
variable
variable
method
class
luxonis_train.nodes.backbones.contextspatial.ContextPath(torch.nn.Module)
method
variable
variable
variable
variable
variable
method
variable
variable
variable
package
luxonis_train.nodes.backbones.ddrnet
module
luxonis_train.nodes.backbones.ddrnet.blocks
class
class
class
function
make_layer(block: type
[
nn.Module
], in_channels: int, channels: int, num_blocks: int, stride: int = 1, expansion: int = 1) -> nn.Sequential: nn.Sequential
Creates a sequential layer consisting of a series of blocks. @type block: Type[nn.Module] @param block: The block class to be used. @type in_channels: int @param in_channels: Number of input channels. @type channels: int @param channels: Number of output channels. @type num_blocks: int @param num_blocks: Number of blocks in the layer. @type stride: int @param stride: Stride for the first block. Defaults to 1. @type expansion: int @param expansion: Expansion factor for the block. Defaults to 1. @return: A sequential container of the blocks.
class
luxonis_train.nodes.backbones.ddrnet.blocks.DAPPMBranch(torch.nn.Module)
method
__init__(self, kernel_size: int, stride: int, in_channels: int, branch_channels: int, inter_mode: str = 'bilinear')
A DAPPM branch. @type kernel_size: int @param kernel_size: The kernel size. When stride=0, this parameter is omitted, and AdaptiveAvgPool2d over all the input is performed. @type stride: int @param stride: Stride for the first convolution. When stride is set to 0, C{AdaptiveAvgPool2d} over all the input is performed (output is 1x1). When set to 1, no operation is performed. When stride>1, a convolution with C{stride=stride} is performed. @type in_channels: int @param in_channels: Number of input channels. @type branch_channels: int @param branch_channels: Width after the first convolution. @type inter_mode: str @param inter_mode: Interpolation mode for upscaling. Defaults to "bilinear".
variable
variable
variable
method
forward(self, x: Tensor
|
list
[
Tensor
]) -> Tensor: Tensor
Process input through the DAPPM branch. @type x: Tensor or list[Tensor] @param x: In branch 0 - the original input of the DAPPM. In other branches - a list containing the original input and the output of the previous branch. @return: Processed output tensor.
class
luxonis_train.nodes.backbones.ddrnet.blocks.DAPPM(torch.nn.Module)
method
__init__(self, in_channels: int, branch_channels: int, out_channels: int, kernel_sizes: list
[
int
], strides: list
[
int
], inter_mode: str = 'bilinear')
DAPPM (Deep Aggregation Pyramid Pooling Module). @type in_channels: int @param in_channels: Number of input channels. @type branch_channels: int @param branch_channels: Width after the first convolution in each branch. @type out_channels: int @param out_channels: Number of output channels. @type kernel_sizes: list[int] @param kernel_sizes: List of kernel sizes for each branch. @type strides: list[int] @param strides: List of strides for each branch. @type inter_mode: str @param inter_mode: Interpolation mode for upscaling. Defaults to "bilinear". @raises ValueError: If the lengths of C{kernel_sizes} and C{strides} are not the same.
variable
variable
variable
method
forward(self, x: Tensor) -> Tensor: Tensor
Forward pass through the DAPPM module. @type x: Tensor @param x: Input tensor. @return: Output tensor after processing through all branches and compression.
class
luxonis_train.nodes.backbones.ddrnet.blocks.BasicDDRBackbone(torch.nn.Module)
method
__init__(self, block: type
[
nn.Module
], stem_channels: int, layers: list
[
int
], in_channels: int, layer3_repeats: int = 1)
Initializes the BasicDDRBackbone with the specified parameters. @type block: Type[nn.Module] @param block: The block class to use for layers. @type stem_channels: int @param stem_channels: Number of output channels in the stem layer. @type layers: list[int] @param layers: Number of blocks in each layer. @type in_channels: int @param in_channels: Number of input channels. @type layer3_repeats: int @param layer3_repeats: Number of repeats for layer3. Defaults to 1.
variable
variable
variable
variable
variable
variable
method
get_backbone_output_number_of_channels(self) -> dict[str, int]: dict[str, int]
Determine the number of output channels for each layer of the backbone. Returns a dictionary with keys "layer2", "layer3", "layer4" and their respective number of output channels. @return: Dictionary of output channel counts for each layer.
module
luxonis_train.nodes.backbones.ddrnet.variants
class
luxonis_train.nodes.backbones.ddrnet.variants.DDRNetVariant(pydantic.BaseModel)
class
luxonis_train.nodes.backbones.ddrnet.DDRNet(luxonis_train.nodes.base_node.BaseNode)
variable
method
__init__(self, variant: Literal
[
'
23-slim
'
,
'
23
'
] = '23-slim', channels: int
|
None = None, highres_channels: int
|
None = None, use_aux_heads: bool = True, upscale_module: nn.Module
|
None = None, spp_width: int = 128, ssp_inter_mode: str = 'bilinear', segmentation_inter_mode: str = 'bilinear', block: type
[
nn.Module
] = BasicResNetBlock, skip_block: type
[
nn.Module
] = BasicResNetBlock, layer5_block: type
[
nn.Module
] = Bottleneck, layer5_bottleneck_expansion: int = 2, spp_kernel_sizes: list
[
int
]
|
None = None, spp_strides: list
[
int
]
|
None = None, layer3_repeats: int = 1, layers: list
[
int
]
|
None = None, download_weights: bool = True, kwargs)
DDRNet backbone. @see: U{Adapted from <https://github.com/Deci-AI/super-gradients/blob/master/src /super_gradients/training/models/segmentation_models/ddrnet.py>} @see: U{Original code <https://github.com/ydhongHIT/DDRNet>} @see: U{Paper <https://arxiv.org/pdf/2101.06085.pdf>} @license: U{Apache License, Version 2.0 <https://github.com/Deci-AI/super- gradients/blob/master/LICENSE.md>} @type variant: Literal["23-slim", "23"] @param variant: DDRNet variant. Defaults to "23-slim". The variant determines the number of channels and highres_channels. The following variants are available: - "23-slim" (default): channels=32, highres_channels=64 - "23": channels=64, highres_channels=128 @type channels: int | None @param channels: Base number of channels. If provided, overrides the variant values. @type highres_channels: int | None @param highres_channels: Number of channels in the high resolution net. If provided, overrides the variant values. @type use_aux_heads: bool @param use_aux_heads: Whether to use auxiliary heads. Defaults to True. @type upscale_module: nn.Module @param upscale_module: Module for upscaling (e.g., bilinear interpolation). Defaults to UpscaleOnline(). @type spp_width: int @param spp_width: Width of the branches in the SPP block. Defaults to 128. @type ssp_inter_mode: str @param ssp_inter_mode: Interpolation mode for the SPP block. Defaults to "bilinear". @type segmentation_inter_mode: str @param segmentation_inter_mode: Interpolation mode for the segmentation head. Defaults to "bilinear". @type block: type[nn.Module] @param block: type of block to use in the backbone. Defaults to BasicResNetBlock. @type skip_block: type[nn.Module] @param skip_block: type of block for skip connections. Defaults to BasicResNetBlock. @type layer5_block: type[nn.Module] @param layer5_block: type of block for layer5 and layer5_skip. Defaults to Bottleneck. @type layer5_bottleneck_expansion: int @param layer5_bottleneck_expansion: Expansion factor for Bottleneck block in layer5. Defaults to 2. @type spp_kernel_sizes: list[int] @param spp_kernel_sizes: Kernel sizes for the SPP module pooling. Defaults to [1, 5, 9, 17, 0]. @type spp_strides: list[int] @param spp_strides: Strides for the SPP module pooling. Defaults to [1, 2, 4, 8, 0]. @type layer3_repeats: int @param layer3_repeats: Number of times to repeat the 3rd stage. Defaults to 1. @type layers: list[int] @param layers: Number of blocks in each layer of the backbone. Defaults to [2, 2, 2, 2, 1, 2, 2, 1]. @type download_weights: bool @param download_weights: If True download weights from COCO (if available for specified variant). Defaults to True.
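Nodes are normally instantiated by the framework from the config; a direct construction sketch (which may require additional BaseNode keyword arguments supplied by the framework) would be:

    from luxonis_train.nodes.backbones.ddrnet import DDRNet

    backbone = DDRNet(
        variant="23",         # channels=64, highres_channels=128
        use_aux_heads=True,   # auxiliary heads for training-time supervision
        download_weights=False,
    )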
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
method
package
luxonis_train.nodes.backbones.efficientrep
module
luxonis_train.nodes.backbones.efficientrep.variants
class
luxonis_train.nodes.backbones.efficientrep.variants.EfficientRepVariant(pydantic.BaseModel)
variable
variable
variable
variable
class
luxonis_train.nodes.backbones.efficientrep.EfficientRep(luxonis_train.nodes.base_node.BaseNode)
variable
method
__init__(self, variant: VariantLiteral = 'nano', channels_list: list[int] | None = None, n_repeats: list[int] | None = None, depth_mul: float | None = None, width_mul: float | None = None, block: Literal['RepBlock', 'CSPStackRepBlock'] | None = None, csp_e: float | None = None, download_weights: bool = True, initialize_weights: bool = True, **kwargs)
Implementation of the EfficientRep backbone. Supports the version with RepBlock and CSPStackRepBlock (for larger networks). Adapted from U{YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications <https://arxiv.org/pdf/2209.02976.pdf>}. @type variant: Literal["n", "nano", "s", "small", "m", "medium", "l", "large"] @param variant: EfficientRep variant. Defaults to "nano". The variant determines the depth and width multipliers, block used and intermediate channel scaling factor. The depth multiplier determines the number of blocks in each stage and the width multiplier determines the number of channels. The following variants are available: - "n" or "nano" (default): depth_multiplier=0.33, width_multiplier=0.25, block=RepBlock, e=None - "s" or "small": depth_multiplier=0.33, width_multiplier=0.50, block=RepBlock, e=None - "m" or "medium": depth_multiplier=0.60, width_multiplier=0.75, block=CSPStackRepBlock, e=2/3 - "l" or "large": depth_multiplier=1.0, width_multiplier=1.0, block=CSPStackRepBlock, e=1/2 @type channels_list: list[int] | None @param channels_list: List of number of channels for each block. If unspecified, defaults to [64, 128, 256, 512, 1024]. @type n_repeats: list[int] | None @param n_repeats: List of number of repeats of RepVGGBlock. If unspecified, defaults to [1, 6, 12, 18, 6]. @type depth_mul: float | None @param depth_mul: Depth multiplier. If provided, overrides the variant value. @type width_mul: float | None @param width_mul: Width multiplier. If provided, overrides the variant value. @type block: Literal["RepBlock", "CSPStackRepBlock"] | None @param block: Base block used when building the backbone. If provided, overrides the variant value. @type csp_e: float | None @param csp_e: Factor that controls number of intermediate channels if block="CSPStackRepBlock". If provided, overrides the variant value. @type download_weights: bool @param download_weights: If True download weights from COCO (if available for specified variant). Defaults to True. @type initialize_weights: bool @param initialize_weights: If True, initialize weights of the model.
variable
variable
method
method
set_export_mode(self, mode: bool = True)
Reparametrizes instances of L{RepVGGBlock} in the network. @type mode: bool @param mode: Whether to set the export mode. Defaults to C{True}.
method
package
luxonis_train.nodes.backbones.efficientvit
module
luxonis_train.nodes.backbones.efficientvit.blocks
class
luxonis_train.nodes.backbones.efficientvit.blocks.DepthwiseSeparableConv(torch.nn.Module)
method
__init__(self, in_channels: int, out_channels: int, kernel_size: int = 3, stride: int = 1, padding: int | str | None = None, use_bias: list[bool] | None = None, activation: list[nn.Module] | None = None, use_residual: bool = False)
Depthwise separable convolution. @type in_channels: int @param in_channels: Number of input channels. @type out_channels: int @param out_channels: Number of output channels. @type kernel_size: int @param kernel_size: Kernel size. Defaults to 3. @type stride: int @param stride: Stride. Defaults to 1. @type use_bias: list[bool, bool] @param use_bias: Whether to use bias for the depthwise and pointwise convolutions. @type activation: list[nn.Module, nn.Module] @param activation: Activation functions for the depthwise and pointwise convolutions.
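Example (an illustrative, self-contained sketch of the depthwise + pointwise factorization; the class name and defaults below are hypothetical, not the exact luxonis_train implementation):

import torch
from torch import nn

class DepthwiseSeparableConvSketch(nn.Module):
    # Depthwise conv (groups == in_channels) followed by a pointwise 1x1 conv.
    def __init__(self, in_channels: int, out_channels: int, kernel_size: int = 3, stride: int = 1):
        super().__init__()
        self.depthwise = nn.Conv2d(
            in_channels, in_channels, kernel_size,
            stride=stride, padding=kernel_size // 2, groups=in_channels,
        )
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pointwise(self.depthwise(x))

x = torch.randn(1, 16, 32, 32)
print(DepthwiseSeparableConvSketch(16, 32)(x).shape)  # torch.Size([1, 32, 32, 32])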
variable
variable
variable
method
class
luxonis_train.nodes.backbones.efficientvit.blocks.MobileBottleneckBlock(torch.nn.Module)
method
__init__(self, in_channels: int, out_channels: int, kernel_size: int = 3, stride: int = 1, expand_ratio: float = 6, use_bias: list[bool] | None = None, use_norm: list[bool] | None = None, activation: list[nn.Module] | None = None, use_residual: bool = False)
MobileBottleneckBlock is a block used in the EfficientViT model. @type in_channels: int @param in_channels: Number of input channels. @type out_channels: int @param out_channels: Number of output channels. @type kernel_size: int @param kernel_size: Kernel size. Defaults to 3. @type stride: int @param stride: Stride. Defaults to 1. @type expand_ratio: float @param expand_ratio: Expansion ratio. Defaults to 6. @type use_bias: list[bool, bool, bool] @param use_bias: Whether to use bias for the depthwise and pointwise convolutions. @type use_norm: list[bool, bool, bool] @param use_norm: Whether to use normalization for the depthwise and pointwise convolutions. @type activation: list[nn.Module, nn.Module, nn.Module] @param activation: Activation functions for the depthwise and pointwise convolutions. @type use_residual: bool @param use_residual: Whether to use residual connection. Defaults to False.
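Example (a rough sketch of the MobileNetV2-style inverted-bottleneck layout this block follows: expand with a 1x1 conv, filter with a depthwise conv, project back with a linear 1x1 conv; the class below is hypothetical, not the exact implementation):

import torch
from torch import nn

class MobileBottleneckSketch(nn.Module):
    def __init__(self, in_channels: int, out_channels: int, expand_ratio: float = 6, stride: int = 1):
        super().__init__()
        mid = int(round(in_channels * expand_ratio))
        self.expand = nn.Sequential(nn.Conv2d(in_channels, mid, 1), nn.BatchNorm2d(mid), nn.ReLU6())
        self.depthwise = nn.Sequential(
            nn.Conv2d(mid, mid, 3, stride=stride, padding=1, groups=mid),
            nn.BatchNorm2d(mid),
            nn.ReLU6(),
        )
        # Linear projection: no activation after the last 1x1 conv.
        self.project = nn.Sequential(nn.Conv2d(mid, out_channels, 1), nn.BatchNorm2d(out_channels))
        self.use_residual = stride == 1 and in_channels == out_channels

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.project(self.depthwise(self.expand(x)))
        return x + out if self.use_residual else out

print(MobileBottleneckSketch(16, 16)(torch.randn(1, 16, 32, 32)).shape)  # torch.Size([1, 16, 32, 32])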
variable
variable
variable
variable
method
class
luxonis_train.nodes.backbones.efficientvit.blocks.EfficientViTBlock(torch.nn.Module)
method
__init__(self, num_channels: int, attention_ratio: float = 1.0, head_dim: int = 32, expansion_factor: float = 4.0, aggregation_scales: tuple[int, ...] = (5,))
EfficientVisionTransformerBlock is a modular component designed for multi-scale linear attention and local feature processing. @type num_channels: int @param num_channels: The number of input and output channels. @type attention_ratio: float @param attention_ratio: Ratio for determining the number of attention heads. Default is 1.0. @type head_dim: int @param head_dim: Dimension size for each attention head. Default is 32. @type expansion_factor: float @param expansion_factor: Factor by which channels expand in the local module. Default is 4.0. @type aggregation_scales: tuple[int, ...] @param aggregation_scales: Tuple defining the scales for aggregation in the attention module. Default is (5,).
variable
variable
method
forward(self, inputs: Tensor) -> Tensor: Tensor
Forward pass of the block. @param inputs: Input tensor with shape [batch, channels, height, width]. @return: Output tensor after attention and local feature processing.
class
luxonis_train.nodes.backbones.efficientvit.blocks.LightweightMLABlock(torch.nn.Module)
method
__init__(self, input_channels: int, output_channels: int, num_heads: int | None = None, head_ratio: float = 1.0, dimension: int = 8, use_bias: list[bool] | None = None, use_norm: list[bool] | None = None, activations: list[nn.Module] | None = None, scale_factors: tuple[int, ...] = (5,), epsilon: float = 1e-15, use_residual: bool = True, kernel_activation: nn.Module | None = None)
LightweightMLABlock is a modular component used in the EfficientViT framework. It facilitates efficient multi-scale linear attention. @param input_channels: Number of input channels. @param output_channels: Number of output channels. @param num_heads: Number of attention heads. Default is None. @param head_ratio: Ratio to determine the number of heads. Default is 1.0. @param dimension: Size of each head. Default is 8. @param use_bias: List specifying if bias is used in qkv and projection layers. @param use_norm: List specifying if normalization is applied in qkv and projection layers. @param activations: List of activation functions for qkv and projection layers. @param scale_factors: Tuple defining scales for aggregation. Default is (5,). @param epsilon: Epsilon value for numerical stability. Default is 1e-15.
variable
variable
variable
variable
variable
variable
variable
method
linear_attention(self, qkv_tensor: Tensor) -> Tensor: Tensor
Implements ReLU-based linear attention.
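Example (a hedged sketch of how ReLU-kernel linear attention is typically computed: replacing the softmax with ReLU feature maps lets the key-value product be aggregated once, reducing the cost from quadratic to linear in the number of tokens; shapes and names are illustrative, not the exact method body):

import torch
import torch.nn.functional as F

def relu_linear_attention_sketch(q, k, v, eps: float = 1e-15):
    # q, k: [B, N, d]; v: [B, N, e].
    # Computes relu(q) @ (relu(k)^T @ v), normalized by relu(q) @ sum(relu(k)),
    # which costs O(N * d * e) instead of the O(N^2) of softmax attention.
    q, k = F.relu(q), F.relu(k)
    kv = torch.einsum("bnd,bne->bde", k, v)                         # aggregated once: [B, d, e]
    z = 1.0 / (torch.einsum("bnd,bd->bn", q, k.sum(dim=1)) + eps)   # per-token normalizer
    return torch.einsum("bnd,bde,bn->bne", q, kv, z)

out = relu_linear_attention_sketch(torch.randn(2, 64, 8), torch.randn(2, 64, 8), torch.randn(2, 64, 8))
print(out.shape)  # torch.Size([2, 64, 8])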
method
quadratic_attention(self, qkv_tensor: Tensor) -> Tensor: Tensor
Implements ReLU-based quadratic attention.
method
module
luxonis_train.nodes.backbones.efficientvit.variants
class
luxonis_train.nodes.backbones.efficientvit.variants.EfficientVitVariant(pydantic.BaseModel)
class
luxonis_train.nodes.backbones.efficientvit.EfficientViT(luxonis_train.nodes.base_node.BaseNode)
variable
method
__init__(self, variant: VariantLiteral = 'n', width_list: list[int] | None = None, depth_list: list[int] | None = None, expand_ratio: int = 4, dim: int | None = None, **kwargs)
EfficientViT backbone implementation based on a lightweight transformer architecture. This implementation is inspired by the architecture described in the paper: "EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction" (https://arxiv.org/abs/2205.14756). The EfficientViT model is designed to provide a balance between computational efficiency and performance, making it suitable for deployment on edge devices with limited resources. @type variant: Literal["n", "nano", "s", "small", "m", "medium", "l", "large"] @param variant: EfficientViT variant. Defaults to "nano". The variant determines the width, depth, and dimension of the network. The following variants are available: - "n" or "nano" (default): width_list=[8, 16, 32, 64, 128], depth_list=[1, 2, 2, 2, 2], dim=16 - "s" or "small": width_list=[16, 32, 64, 128, 256], depth_list=[1, 2, 3, 3, 4], dim=16 - "m" or "medium": width_list=[24, 48, 96, 192, 384], depth_list=[1, 3, 4, 4, 6], dim=32 - "l" or "large": width_list=[32, 64, 128, 256, 512], depth_list=[1, 4, 6, 6, 9], dim=32 @type width_list: list[int] | None @param width_list: List of number of channels for each block. If unspecified, defaults to the variant's width_list. @type depth_list: list[int] | None @param depth_list: List of number of layers in each block. If unspecified, defaults to the variant's depth_list. @type expand_ratio: int @param expand_ratio: Expansion ratio for the MobileBottleneckBlock. Defaults to 4. @type dim: int | None @param dim: Dimension of the transformer. Defaults to the variant's dim.
variable
variable
method
package
luxonis_train.nodes.backbones.ghostfacenet
module
luxonis_train.nodes.backbones.ghostfacenet.blocks
class
luxonis_train.nodes.backbones.ghostfacenet.blocks.GhostModuleV2(torch.nn.Module)
method
variable
variable
variable
variable
variable
method
class
luxonis_train.nodes.backbones.ghostfacenet.blocks.GhostBottleneckV2(torch.nn.Module)
method
variable
variable
variable
variable
variable
variable
variable
method
module
luxonis_train.nodes.backbones.ghostfacenet.variants
class
luxonis_train.nodes.backbones.ghostfacenet.variants.BlockConfig(pydantic.BaseModel)
variable
variable
variable
variable
variable
class
luxonis_train.nodes.backbones.ghostfacenet.variants.GhostFaceNetsVariant(pydantic.BaseModel)
class
luxonis_train.nodes.backbones.ghostfacenet.GhostFaceNetV2(luxonis_train.nodes.base_node.BaseNode)
variable
variable
method
__init__(self, variant: Literal['V2'] = 'V2', **kwargs)
GhostFaceNetsV2 backbone. GhostFaceNetsV2 is a convolutional neural network architecture focused on face recognition, but it is adaptable to generic embedding tasks. It is based on the GhostNet architecture and uses Ghost BottleneckV2 blocks. Source: U{https://github.com/Hazqeel09/ellzaf_ml/blob/main/ellzaf_ml/models/ghostfacenetsv2.py} @license: U{MIT License <https://github.com/Hazqeel09/ellzaf_ml/blob/main/LICENSE>} @see: U{GhostFaceNets: Lightweight Face Recognition Model From Cheap Operations <https://www.researchgate.net/publication/369930264_GhostFaceNets_Lightweight_Face_Recognition_Model_from_Cheap_Operations>} @type variant: Literal["V2"] @param variant: Variant of the GhostFaceNets embedding model. Defaults to "V2" (which is the only variant available).
variable
method
module
luxonis_train.nodes.backbones.micronet.blocks
class
class
class
class
class
class
class
luxonis_train.nodes.backbones.micronet.blocks.MicroBlock(torch.nn.Module)
method
__init__(self, in_channels: int, out_channels: int, kernel_size: int = 3, stride: int = 1, expansion_ratios: tuple[int, int] = (2, 2), groups_1: tuple[int, int] = (0, 6), groups_2: tuple[int, int] = (1, 1), use_dynamic_shift: tuple[int, int, int] = (2, 0, 1), reduction_factor: int = 1, init_a: tuple[float, float] = (1.0, 1.0), init_b: tuple[float, float] = (0.0, 0.0))
MicroBlock: The basic building block of MicroNet. This block implements the Micro-Factorized Convolution and Dynamic Shift-Max activation. It can be configured to use different combinations of these components based on the network design. @type in_channels: int @param in_channels: Number of input channels. @type out_channels: int @param out_channels: Number of output channels. @type kernel_size: int @param kernel_size: Size of the convolution kernel. Defaults to 3. @type stride: int @param stride: Stride of the convolution. Defaults to 1. @type expansion_ratios: tuple[int, int] @param expansion_ratios: Expansion ratios for the intermediate channels. Defaults to (2, 2). @type groups_1: tuple[int, int] @param groups_1: Groups for the first set of convolutions. Defaults to (0, 6). @type groups_2: tuple[int, int] @param groups_2: Groups for the second set of convolutions. Defaults to (1, 1). @type use_dynamic_shift: tuple[int, int, int] @param use_dynamic_shift: Flags to use Dynamic Shift-Max in different positions. Defaults to (2, 0, 1). @type reduction_factor: int @param reduction_factor: Reduction factor for the squeeze-and-excitation-like operation. Defaults to 1. @type init_a: tuple[float, float] @param init_a: Initialization parameters for Dynamic Shift-Max. Defaults to (1.0, 1.0). @type init_b: tuple[float, float] @param init_b: Initialization parameters for Dynamic Shift-Max. Defaults to (0.0, 0.0).
variable
variable
variable
method
class
luxonis_train.nodes.backbones.micronet.blocks.ChannelShuffle(torch.nn.Module)
method
__init__(self, groups: int)
Shuffle the channels of the input tensor. This operation is used to mix information between groups after grouped convolutions. @type groups: int @param groups: Number of groups to divide the channels into before shuffling.
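Example (a self-contained sketch of the standard reshape/transpose channel shuffle; the function name is hypothetical):

import torch

def channel_shuffle_sketch(x: torch.Tensor, groups: int) -> torch.Tensor:
    # Reshape to [B, groups, C // groups, H, W], swap the two channel axes,
    # then flatten back so channels from different groups interleave.
    b, c, h, w = x.shape
    assert c % groups == 0
    return x.view(b, groups, c // groups, h, w).transpose(1, 2).reshape(b, c, h, w)

x = torch.arange(8.0).view(1, 8, 1, 1)
print(channel_shuffle_sketch(x, 2).flatten().tolist())  # [0.0, 4.0, 1.0, 5.0, 2.0, 6.0, 3.0, 7.0]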
variable
method
class
luxonis_train.nodes.backbones.micronet.blocks.DYShiftMax(torch.nn.Module)
method
__init__(self, in_channels: int, out_channels: int, init_a: tuple[float, float] = (0.0, 0.0), init_b: tuple[float, float] = (0.0, 0.0), use_relu: bool = True, groups: int = 6, reduction: int = 4, expansion: bool = False)
Dynamic Shift-Max activation function. This module implements the Dynamic Shift-Max operation, which adaptively fuses and selects channel information based on the input. @type in_channels: int @param in_channels: Number of input channels. @type out_channels: int @param out_channels: Number of output channels. @type init_a: tuple[float, float] @param init_a: Initial values for the 'a' parameters. Defaults to (0.0, 0.0). @type init_b: tuple[float, float] @param init_b: Initial values for the 'b' parameters. Defaults to (0.0, 0.0). @type use_relu: bool @param use_relu: Whether to use ReLU activation. Defaults to True. @type groups: int @param groups: Number of groups for channel shuffling. Defaults to 6. @type reduction: int @param reduction: Reduction factor for the squeeze operation. Defaults to 4. @type expansion: bool @param expansion: Whether to use expansion in grouping. Defaults to False.
variable
variable
variable
variable
variable
variable
variable
method
class
luxonis_train.nodes.backbones.micronet.blocks.SpatialSepConvSF(torch.nn.Module)
class
luxonis_train.nodes.backbones.micronet.blocks.Stem(torch.nn.Module)
class
luxonis_train.nodes.backbones.micronet.blocks.DepthSpatialSepConv(torch.nn.Module)
module
luxonis_train.nodes.backbones.micronet.variants
class
class
constant
constant
constant
function
class
luxonis_train.nodes.backbones.micronet.variants.MicroBlockConfig(pydantic.BaseModel)
variable
variable
variable
variable
variable
variable
variable
variable
class
luxonis_train.nodes.backbones.micronet.variants.MicroNetVariant(pydantic.BaseModel)
variable
variable
variable
variable
variable
variable
class
luxonis_train.nodes.backbones.micronet.MicroNet(luxonis_train.nodes.base_node.BaseNode)
method
__init__(self, variant: Literal['M1', 'M2', 'M3'] = 'M1', out_indices: list[int] | None = None, **kwargs)
MicroNet backbone. This class creates the full MicroNet architecture based on the specified variant. It consists of a stem layer followed by multiple MicroBlocks. @type variant: Literal["M1", "M2", "M3"] @param variant: Model variant to use. Defaults to "M1". @type out_indices: list[int] | None @param out_indices: Indices of the output layers. If provided, overrides the variant value.
variable
variable
method
package
luxonis_train.nodes.backbones.mobileone
module
luxonis_train.nodes.backbones.mobileone.blocks
class
MobileOneBlock
MobileOne building block. This block has a multi-branched architecture at train-time and a plain CNN-style architecture at inference time. For more details, please refer to the paper: U{MobileOne: An Improved One millisecond Mobile Backbone <https://arxiv.org/abs/2206.04040>}.
class
luxonis_train.nodes.backbones.mobileone.blocks.MobileOneBlock(torch.nn.Module)
method
__init__(self, in_channels: int, out_channels: int, kernel_size: int, stride: int = 1, padding: int = 0, groups: int = 1, use_se: bool = False, n_conv_branches: int = 1)
Construct a MobileOneBlock module. @type in_channels: int @param in_channels: Number of channels in the input. @type out_channels: int @param out_channels: Number of channels produced by the block. @type kernel_size: int @param kernel_size: Size of the convolution kernel. @type stride: int @param stride: Stride size. Defaults to 1. @type padding: int @param padding: Zero-padding size. Defaults to 0. @type groups: int @param groups: Group number. Defaults to 1. @type use_se: bool @param use_se: Whether to use SE-ReLU activations. Defaults to False. @type n_conv_branches: int @param n_conv_branches: Number of convolutional branches. Defaults to 1.
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
forward(self, inputs: Tensor) -> Tensor: Tensor
Apply forward pass.
method
reparameterize(self)
Following works like U{RepVGG: Making VGG-style ConvNets Great Again <https://arxiv.org/pdf/2101.03697.pdf>}, we re-parameterize the multi-branched architecture used at training time to obtain a plain CNN-like structure for inference.
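The core of this re-parametrization is folding each conv + BN branch into a single equivalent convolution, after which the branches can be summed into one kernel and bias. A minimal sketch of the standard fusion formula (not the exact method body; the helper name is hypothetical):

import torch
from torch import nn

def fuse_conv_bn_sketch(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> tuple[torch.Tensor, torch.Tensor]:
    # BN(conv(x)) = gamma * (conv(x) - mean) / std + beta
    #             = x * (W * gamma / std) + (beta - mean * gamma / std)
    std = (bn.running_var + bn.eps).sqrt()
    scale = bn.weight / std                             # gamma / std, shape [out_channels]
    kernel = conv.weight * scale.reshape(-1, 1, 1, 1)
    bias = bn.bias - bn.running_mean * scale
    return kernel, bias

# Sanity check: the fused conv matches conv followed by BN in eval mode.
conv, bn = nn.Conv2d(4, 8, 3, padding=1, bias=False), nn.BatchNorm2d(8).eval()
k, b = fuse_conv_bn_sketch(conv, bn)
fused = nn.Conv2d(4, 8, 3, padding=1)
with torch.no_grad():
    fused.weight.copy_(k)
    fused.bias.copy_(b)
x = torch.randn(1, 4, 16, 16)
print(torch.allclose(bn(conv(x)), fused(x), atol=1e-6))  # True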
variable
module
luxonis_train.nodes.backbones.mobileone.variants
class
luxonis_train.nodes.backbones.mobileone.variants.MobileOneVariant(pydantic.BaseModel)
variable
variable
variable
class
luxonis_train.nodes.backbones.mobileone.MobileOne(luxonis_train.nodes.base_node.BaseNode)
variable
method
__init__(self, variant: Literal['s0', 's1', 's2', 's3', 's4'] = 's0', width_multipliers: tuple[float, float, float, float] | None = None, n_conv_branches: int | None = None, use_se: bool | None = None, **kwargs)
MobileOne: An efficient CNN backbone for mobile devices. The architecture focuses on reducing memory access costs and improving parallelism while allowing aggressive parameter scaling for better representation capacity. Different variants (S0-S4) offer various accuracy-latency tradeoffs. Key features: - Designed for low latency on mobile while maintaining high accuracy - Uses re-parameterizable branches during training that get folded at inference - Employs trivial over-parameterization branches for improved accuracy - Simple feed-forward structure at inference with no branches/skip connections - Variants achieve <1ms inference time on iPhone 12 with up to 75.9% top-1 ImageNet accuracy - Outperforms other efficient architectures like MobileNets on image classification, object detection and semantic segmentation tasks - Uses only basic operators available across platforms (no custom activations) Reference: U{MobileOne: An Improved One millisecond Mobile Backbone <https://arxiv.org/abs/2206.04040>} @type variant: Literal["s0", "s1", "s2", "s3", "s4"] @param variant: Specifies which variant of the MobileOne network to use. Defaults to "s0". Each variant specifies a predefined set of values for: - width multipliers - A tuple of 4 float values specifying the width multipliers for each stage of the network. If the use of SE blocks is disabled, the last two values are ignored. - number of convolution branches - An integer specifying the number of linear convolution branches in MobileOne block. - use of SE blocks - A boolean specifying whether to use SE blocks in the network. The variants are as follows: - s0 (default): width_multipliers=(0.75, 1.0, 1.0, 2.0), n_conv_branches=4, use_se=False - s1: width_multipliers=(1.5, 1.5, 2.0, 2.5), n_conv_branches=1, use_se=False - s2: width_multipliers=(1.5, 2.0, 2.5, 4.0), n_conv_branches=1, use_se=False - s3: width_multipliers=(2.0, 2.5, 3.0, 4.0), n_conv_branches=1, use_se=False - s4: width_multipliers=(3.0, 3.5, 3.5, 4.0), n_conv_branches=1, use_se=True @type width_multipliers: tuple[float, float, float, float] | None @param width_multipliers: Width multipliers for each stage. If provided, overrides the variant values. @type n_conv_branches: int | None @param n_conv_branches: Number of linear convolution branches in MobileOne block. If provided, overrides the variant values. @type use_se: bool | None @param use_se: Whether to use C{Squeeze-and-Excitation} blocks in the network. If provided, overrides the variant value.
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
method
set_export_mode(self, mode: bool = True)
Sets the module to export mode. Reparameterizes the model to obtain a plain CNN-like structure for inference. @warning: The re-parametrization is destructive and cannot be reversed! @type mode: bool @param mode: Whether to set the export mode. Defaults to C{True}.
package
luxonis_train.nodes.backbones.pplcnet_v3
module
luxonis_train.nodes.backbones.pplcnet_v3.blocks
class
luxonis_train.nodes.backbones.pplcnet_v3.blocks.Act(torch.nn.Module)
method
variable
variable
method
class
luxonis_train.nodes.backbones.pplcnet_v3.blocks.LearnableAffineBlock(torch.nn.Module)
class
luxonis_train.nodes.backbones.pplcnet_v3.blocks.LearnableRepLayer(torch.nn.Module)
method
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
method
variable
class
luxonis_train.nodes.backbones.pplcnet_v3.blocks.LCNetV3Block(torch.nn.Module)
module
luxonis_train.nodes.backbones.pplcnet_v3.variants
class
luxonis_train.nodes.backbones.pplcnet_v3.variants.PPLCNetV3Variant(pydantic.BaseModel)
variable
variable
variable
variable
class
luxonis_train.nodes.backbones.pplcnet_v3.PPLCNetV3(luxonis_train.nodes.base_node.BaseNode)
variable
method
__init__(self, variant: Literal['rec-light'] = 'rec-light', scale: float | None = None, conv_kxk_num: int | None = None, det: bool | None = None, net_config: dict[str, list[list[int | bool]]] | None = None, max_text_len: int = 40, **kwargs)
PPLCNetV3 backbone. @see: U{Adapted from <https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppocr/modeling/backbones/rec_lcnetv3.py>} @see: U{Original code <https://github.com/PaddlePaddle/PaddleOCR>} @license: U{Apache License, Version 2.0 <https://github.com/PaddlePaddle/PaddleOCR/blob/main/LICENSE>} @type scale: float | None @param scale: Scale factor. If provided, overrides the variant value. Defaults to 0.95. @type conv_kxk_num: int | None @param conv_kxk_num: Number of convolution branches. If provided, overrides the variant value. Defaults to 4. @type det: bool | None @param det: Whether to use the detection backbone. If provided, overrides the variant value. Defaults to False. @type max_text_len: int @param max_text_len: Maximum text length. Defaults to 40.
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
set_export_mode(self, mode: bool = True)
Reparametrizes instances of L{LearnableRepLayer} in the network. @type mode: bool @param mode: Whether to set the export mode. Defaults to C{True}.
method
module
luxonis_train.nodes.backbones.recsubnet.blocks
class
luxonis_train.nodes.backbones.recsubnet.blocks.ConvBlock(torch.nn.Module)
class
luxonis_train.nodes.backbones.recsubnet.blocks.Encoder(torch.nn.Module)
method
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
class
luxonis_train.nodes.backbones.recsubnet.blocks.Decoder(torch.nn.Module)
method
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
method
class
luxonis_train.nodes.backbones.recsubnet.blocks.NanoEncoder(torch.nn.Module)
method
variable
variable
variable
variable
method
class
luxonis_train.nodes.backbones.recsubnet.blocks.NanoDecoder(torch.nn.Module)
module
luxonis_train.nodes.backbones.recsubnet.recsubnet
type alias
function
get_variant(variant: VariantLiteral) -> int: int
Returns the base width for the specified variant.
class
luxonis_train.nodes.backbones.recsubnet.RecSubNet(luxonis_train.nodes.base_node.BaseNode)
variable
variable
variable
method
__init__(self, in_channels: int = 3, out_channels: int = 3, base_width: int | None = None, variant: VariantLiteral = 'l', **kwargs)
RecSubNet: A reconstruction sub-network that consists of an encoder and a decoder. This model is designed to reconstruct the original image from an input image that contains noise or anomalies. The encoder extracts relevant features from the noisy input, and the decoder attempts to reconstruct the clean version of the image by eliminating the noise or anomalies. This architecture is based on the paper: "DRAEM - A discriminatively trained reconstruction embedding for surface anomaly detection" (https://arxiv.org/abs/2108.07610). @type in_channels: int @param in_channels: Number of input channels for the encoder. Defaults to 3. @type out_channels: int @param out_channels: Number of output channels for the decoder. Defaults to 3. @type base_width: int | None @param base_width: The base width of the network. Determines the number of filters in the encoder and decoder. If unspecified, derived from the variant. @type variant: Literal["n", "l"] @param variant: The variant of the RecSubNet to use. "l" for large, "n" for nano (lightweight). Defaults to "l".
variable
variable
method
forward(self, x: Tensor) -> tuple[Tensor, Tensor]: tuple[Tensor, Tensor]
Performs the forward pass through the encoder and decoder.
module
luxonis_train.nodes.backbones.repvgg.variants
class
luxonis_train.nodes.backbones.repvgg.variants.RepVGGVariant(pydantic.BaseModel)
class
luxonis_train.nodes.backbones.repvgg.RepVGG(luxonis_train.nodes.base_node.BaseNode)
variable
variable
method
__init__(self, variant: Literal['A0', 'A1', 'A2'] = 'A0', n_blocks: tuple[int, int, int, int] | None = None, width_multiplier: tuple[float, float, float, float] | None = None, override_groups_map: dict[int, int] | None = None, use_se: bool = False, use_checkpoint: bool = False, **kwargs)
RepVGG backbone. RepVGG is a VGG-style convolutional architecture. - Simple feed-forward topology without any branching. - 3x3 convolutions and ReLU activations. - No automatic search, manual refinement or compound scaling. @license: U{MIT <https://github.com/DingXiaoH/RepVGG/blob/main/LICENSE>}. @see: U{https://github.com/DingXiaoH/RepVGG} @see: U{https://paperswithcode.com/method/repvgg} @see: U{RepVGG: Making VGG-style ConvNets Great Again <https://arxiv.org/abs/2101.03697>} @type variant: Literal["A0", "A1", "A2"] @param variant: RepVGG model variant. Defaults to "A0". @type n_blocks: tuple[int, int, int, int] | None @param n_blocks: Number of blocks in each stage. @type width_multiplier: tuple[float, float, float, float] | None @param width_multiplier: Width multiplier for each stage. @type override_groups_map: dict[int, int] | None @param override_groups_map: Dictionary mapping layer index to number of groups. The layers are indexed starting from 0. @type use_se: bool @param use_se: Whether to use Squeeze-and-Excitation blocks. @type use_checkpoint: bool @param use_checkpoint: Whether to use checkpointing.
variable
variable
variable
variable
variable
variable
method
method
set_export_mode(self, mode: bool = True)
Reparametrizes instances of L{RepVGGBlock} in the network. @type mode: bool @param mode: Whether to set the export mode. Defaults to C{True}.
module
luxonis_train.nodes.backbones.rexnetv1
class
class
luxonis_train.nodes.backbones.rexnetv1.LinearBottleneck(torch.nn.Module)
method
variable
variable
variable
variable
method
class
luxonis_train.nodes.backbones.ContextSpatial(luxonis_train.nodes.base_node.BaseNode)
method
__init__(self, context_backbone: str | nn.Module = 'MobileNetV2', backbone_kwargs: Kwargs | None = None, **kwargs)
Context Spatial backbone introduced in BiSeNetV1. Source: U{BiseNetV1<https://github.com/taveraantonio/BiseNetv1>} @see: U{BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation <https://arxiv.org/abs/1808.00897>} @type context_backbone: str | nn.Module @param context_backbone: Backbone used in the context path. Can be either a string or a C{torch.nn.Module}. If a string argument is used, it has to be a name of a module stored in the L{NODES} registry. Defaults to C{MobileNetV2}. @type backbone_kwargs: dict | None @param backbone_kwargs: Keyword arguments for the backbone. Only used when the C{context_backbone} argument is a string.
variable
variable
variable
method
class
luxonis_train.nodes.backbones.EfficientNet(luxonis_train.nodes.base_node.BaseNode)
variable
method
__init__(self, download_weights: bool = True, out_indices: list[int] | None = None, **kwargs)
EfficientNet backbone. EfficientNet is a convolutional neural network architecture and scaling method that uniformly scales all dimensions of depth/width/resolution using a compound coefficient. Unlike conventional practice that arbitrarily scales these factors, the EfficientNet scaling method uniformly scales network width, depth, and resolution with a set of fixed scaling coefficients. Source: U{https://github.com/rwightman/gen-efficientnet-pytorch} @license: U{Apache License, Version 2.0 <https://github.com/rwightman/gen-efficientnet-pytorch/blob/master/LICENSE>} @see: U{https://paperswithcode.com/method/efficientnet} @see: U{EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks <https://arxiv.org/abs/1905.11946>} @type download_weights: bool @param download_weights: If C{True} download weights from imagenet. Defaults to C{True}. @type out_indices: list[int] | None @param out_indices: Indices of the output layers. Defaults to [0, 1, 2, 4, 6].
variable
variable
method
class
luxonis_train.nodes.backbones.MobileNetV2(luxonis_train.nodes.base_node.BaseNode)
method
__init__(self, download_weights: bool = True, out_indices: list[int] | None = None, **kwargs)
MobileNetV2 backbone. This class implements the MobileNetV2 model as described in: U{MobileNetV2: Inverted Residuals and Linear Bottlenecks <https://arxiv.org/pdf/1801.04381v4>} by Sandler I{et al.} The network consists of an initial fully convolutional layer, followed by 19 bottleneck residual blocks, and a final 1x1 convolution. It can be used as a feature extractor for tasks like image classification, object detection, and semantic segmentation. Key features: - Inverted residual structure with linear bottlenecks - Depth-wise separable convolutions for efficiency - Configurable width multiplier and input resolution @type download_weights: bool @param download_weights: If True download weights from imagenet. Defaults to True. @type out_indices: list[int] | None @param out_indices: Indices of the output layers. Defaults to [3, 6, 13, 18].
variable
variable
method
class
luxonis_train.nodes.backbones.ResNet(luxonis_train.nodes.base_node.BaseNode)
method
__init__(self, variant: Literal['18', '34', '50', '101', '152'] = '18', download_weights: bool = True, zero_init_residual: bool = False, groups: int = 1, width_per_group: int = 64, replace_stride_with_dilation: tuple[bool, bool, bool] = (False, False, False), **kwargs)
ResNet backbone. Implements the backbone of a ResNet (Residual Network) architecture. ResNet is designed to address the vanishing gradient problem in deep neural networks by introducing skip connections. These connections allow the network to learn residual functions with reference to the layer inputs, enabling training of much deeper networks. This backbone can be used as a feature extractor for various computer vision tasks such as image classification, object detection, and semantic segmentation. It provides a robust set of features that can be fine-tuned for specific applications. The architecture consists of stacked residual blocks, each containing convolutional layers, batch normalization, and ReLU activations. The skip connections can be either identity mappings or projections, depending on the block type. Source: U{https://pytorch.org/vision/main/models/resnet.html} @license: U{PyTorch<https://github.com/pytorch/pytorch/blob/master/LICENSE>} @param variant: ResNet variant, determining the depth and structure of the network. Options are: - "18": 18 layers, uses basic blocks, smaller model suitable for simpler tasks. - "34": 34 layers, uses basic blocks, good balance of depth and computation. - "50": 50 layers, introduces bottleneck blocks, deeper feature extraction. - "101": 101 layers, uses bottleneck blocks, high capacity for complex tasks. - "152": 152 layers, deepest variant, highest capacity but most computationally intensive. The number in each variant represents the total number of weighted layers. Deeper networks generally offer higher accuracy but require more computation. @type variant: Literal["18", "34", "50", "101", "152"] @default variant: "18" @type download_weights: bool @param download_weights: If True download weights trained on imagenet. Defaults to True. @type zero_init_residual: bool @param zero_init_residual: Zero-initialize the last BN in each residual branch, so that the residual branch starts with zeros, and each residual block behaves like an identity. This improves the model by 0.2~0.3% according to U{Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour <https://arxiv.org/abs/1706.02677>}. Defaults to C{False}. @type groups: int @param groups: Number of groups for each block. Defaults to 1. Can be set to a different value only for ResNet-50, ResNet-101, and ResNet-152. The width of the convolutional blocks is computed as C{int(in_channels * (width_per_group / 64.0)) * groups} @type width_per_group: int @param width_per_group: Number of channels per group. Defaults to 64. Can be set to a different value only for ResNet-50, ResNet-101, and ResNet-152. The width of the convolutional blocks is computed as C{int(in_channels * (width_per_group / 64.0)) * groups} @type replace_stride_with_dilation: tuple[bool, bool, bool] @param replace_stride_with_dilation: Tuple of booleans where each indicates if the 2x2 strides should be replaced with a dilated convolution instead. Defaults to (False, False, False). Can be set to a different value only for ResNet-50, ResNet-101, and ResNet-152.
variable
method
class
luxonis_train.nodes.backbones.ReXNetV1_lite(luxonis_train.nodes.base_node.BaseNode)
method
__init__(self, fix_head_stem: bool = False, divisible_value: int = 8, input_ch: int = 16, final_ch: int = 164, multiplier: float = 1.0, kernel_sizes: int | list[int] = 3, out_indices: list[int] | None = None, **kwargs)
ReXNetV1 (Rank Expansion Networks) backbone, lite version. ReXNet proposes a new approach to designing lightweight CNN architectures by: - Studying proper channel dimension expansion at the layer level using rank analysis - Searching for effective channel configurations across the entire network - Parameterizing channel dimensions as a linear function of network depth Key aspects: - Uses inverted bottleneck blocks similar to MobileNetV2 - Employs a linear parameterization of channel dimensions across blocks - Replaces ReLU6 with SiLU (Swish-1) activation in certain layers - Incorporates Squeeze-and-Excitation modules ReXNet achieves state-of-the-art performance among lightweight models on ImageNet classification and transfers well to tasks like object detection and fine-grained classification. Source: U{https://github.com/clovaai/rexnet} @license: U{MIT <https://github.com/clovaai/rexnet/blob/master/LICENSE>} @copyright: 2021-present NAVER Corp. @see: U{Rethinking Channel Dimensions for Efficient Model Design <https://arxiv.org/abs/2007.00992>} @type fix_head_stem: bool @param fix_head_stem: Whether to multiply head stem. Defaults to False. @type divisible_value: int @param divisible_value: Divisor used. Defaults to 8. @type input_ch: int @param input_ch: Starting channel dimension. Defaults to 16. @type final_ch: int @param final_ch: Final channel dimension. Defaults to 164. @type multiplier: float @param multiplier: Channel dimension multiplier. Defaults to 1.0. @type kernel_sizes: int | list[int] @param kernel_sizes: Kernel size for each block. Defaults to 3. @type out_indices: list[int] | None @param out_indices: Indices of the output layers. Defaults to [1, 4, 10, 17].
variable
variable
variable
method
module
luxonis_train.nodes.base_node
type variable
type variable
class
BaseNode
A base class for all model nodes. This class defines the basic interface for all nodes. Furthermore, it utilizes automatic registration of defined subclasses to a NODES registry. Inputs and outputs of nodes are defined as Packets. A Packet is a dictionary of lists of tensors. Each key in the dictionary represents a different output from the previous node. Input to the node is a list of Packets, output is a single Packet. When the node is called, the inputs are sent to the unwrap method. The unwrap method should return a valid input to the forward method. Outputs of the forward method are then sent to the wrap method, which wraps the output into a Packet. The wrapped Packet is the final output of the node. The run method combines the unwrap, forward and wrap methods together with input validation. When subclassing, the following methods should be implemented: forward: Forward pass of the module. unwrap: Optional. Unwraps the inputs from the input packet. The default implementation expects a single input with features key. wrap: Optional. Wraps the output of the forward pass into a Packet[Tensor]. The default implementation wraps the output of the forward pass into a packet with either "features" or the task name as the key. Additionally, the following class attributes can be defined: attach_index: Index of previous output that this node attaches to. task: An instance of `luxonis_train.tasks.Task` that specifies the task of the node. Usually defined for head nodes. Example: class MyNode(BaseNode): task = Tasks.CLASSIFICATION def __init__(self, **kwargs): super().__init__(**kwargs) self.nn = nn.Sequential( nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 10), ) # Roughly equivalent to the default implementation def unwrap(self, inputs: list[Packet[Tensor]]) -> Tensor: assert len(inputs) == 1 assert "features" in inputs[0] return inputs[0]["features"] def forward(self, inputs: Tensor) -> Tensor: return self.nn(inputs) # Roughly equivalent to the default implementation def wrap(self, output: Tensor) -> Packet[Tensor]: # The key of the main node output has to be the same as the # default task name for it to be automatically recognized # by the attached modules. return {"classification": [output]}
class
luxonis_train.nodes.base_node.BaseNode(torch.nn.Module, abc.ABC, typing.Generic)
variable
attach_index
Index of previous output that this node attaches to. Can be a single integer to specify a single output, a tuple of two or three integers to specify a range of outputs or "all" to specify all outputs. Defaults to "all". Python indexing conventions apply.
variable
method
__init__(self, input_shapes: list[Packet[Size]] | None = None, original_in_shape: Size | None = None, dataset_metadata: DatasetMetadata | None = None, n_classes: int | None = None, n_keypoints: int | None = None, in_sizes: Size | list[Size] | None = None, remove_on_export: bool = False, export_output_names: list[str] | None = None, attach_index: AttachIndexType | None = None, task_name: str | None = None)
Constructor for the C{BaseNode}. @type input_shapes: list[Packet[Size]] | None @param input_shapes: List of input shapes for the module. @type original_in_shape: Size | None @param original_in_shape: Original input shape of the model. Some nodes won't function if not provided. @type dataset_metadata: L{DatasetMetadata} | None @param dataset_metadata: Metadata of the dataset. Some nodes won't function if not provided. @type n_classes: int | None @param n_classes: Number of classes in the dataset. Provide only in case C{dataset_metadata} is not provided. Defaults to None. @type n_keypoints: int | None @param n_keypoints: Number of keypoints in the dataset. Provide only in case C{dataset_metadata} is not provided. Defaults to None. @type in_sizes: Size | list[Size] | None @param in_sizes: List of input sizes for the node. Provide only in case the C{input_shapes} were not provided. @type remove_on_export: bool @param remove_on_export: If set to True, the node will be removed from the model during export. Defaults to False. @type export_output_names: list[str] | None @param export_output_names: List of output names for the export. @type attach_index: AttachIndexType | None @param attach_index: Index of previous output that this node attaches to. Can be a single integer to specify a single output, a tuple of two or three integers to specify a range of outputs or C{"all"} to specify all outputs. Defaults to "all". Python indexing conventions apply. If provided as a constructor argument, overrides the class attribute. @type task_name: str | None @param task_name: Specifies which task group from the dataset to use in case the dataset contains multiple tasks. Otherwise, the task group is inferred from the dataset metadata.
variable
variable
property
property
n_keypoints
Getter for the number of keypoints.
property
n_classes
Getter for the number of classes.
property
classes
Getter for the class mappings.
property
class_names
Getter for the class names.
property
input_shapes
Getter for the input shapes.
property
original_in_shape
Getter for the original input shape as [N, H, W].
property
dataset_metadata
Getter for the dataset metadata.
property
in_sizes
Simplified getter for the input shapes. Should work out of the box for most cases where the input_shapes are sufficiently simple. Otherwise, the input_shapes should be used directly. In case in_sizes were provided during initialization, they are returned directly. Example: >>> input_shapes = [{"features": [Size(64, 128, 128), Size(3, 224, 224)]}] >>> attach_index = -1 >>> in_sizes = Size(3, 224, 224) >>> input_shapes = [{"features": [Size(64, 128, 128), Size(3, 224, 224)]}] >>> attach_index = "all" >>> in_sizes = [Size(64, 128, 128), Size(3, 224, 224)]
property
in_channels
Simplified getter for the number of input channels. Should work out of the box for most cases where the input_shapes are sufficiently simple. Otherwise, the input_shapes should be used directly. If attach_index is set to "all" or is a slice, returns a list of input channels, otherwise returns a single value.
property
in_height
Simplified getter for the input height. Should work out of the box for most cases where the input_shapes are sufficiently simple. Otherwise, the input_shapes should be used directly.
property
in_width
Simplified getter for the input width. Should work out of the box for most cases where the input_shapes are sufficiently simple. Otherwise, the input_shapes should be used directly.
method
load_checkpoint(self, path: str | None = None, strict: bool = True)
Loads checkpoint for the module. If path is url then it downloads it locally and stores it in cache. @type path: str | None @param path: Path to local or remote .ckpt file. @type strict: bool @param strict: Whether to load weights strictly or not. Defaults to True.
property
export
Getter for the export mode.
method
set_export_mode(self, mode: bool = True)
Sets the module to export mode. @type mode: bool @param mode: Value to set the export mode to. Defaults to True.
property
remove_on_export
Getter for the remove_on_export attribute.
property
export_output_names
Getter for the export_output_names attribute.
method
unwrap(self, inputs: list[Packet[Tensor]]) -> ForwardInputT: ForwardInputT
Prepares inputs for the forward pass. Unwraps the inputs from the C{list[Packet[Tensor]]} input so they can be passed to the forward call. The default implementation expects a single input with C{features} key and returns the tensor or tensors at the C{attach_index} position. For most cases the default implementation should be sufficient. Exceptions are modules with multiple inputs. @type inputs: list[Packet[Tensor]] @param inputs: Inputs to the node. @rtype: ForwardInputT @return: Prepared inputs, ready to be passed to the L{forward} method. @raises ValueError: If the number of inputs is not equal to 1. In such cases the method has to be overridden.
method
forward(self, inputs: ForwardInputT) -> ForwardOutputT: ForwardOutputT
Forward pass of the module. @type inputs: L{ForwardInputT} @param inputs: Inputs to the module. @rtype: L{ForwardOutputT} @return: Result of the forward pass.
method
wrap(self, output: ForwardOutputT) -> Packet[Tensor]: Packet[Tensor]
Wraps the output of the forward pass into a C{Packet[Tensor]}. The default implementation expects a single tensor or a list of tensors and wraps them into a Packet with either the node task as a key or "features" key if task is not defined. Example:: >>> class FooNode(BaseNode): ... task = Tasks.CLASSIFICATION ... ... class BarNode(BaseNode): ... pass ... >>> node = FooNode() >>> node.wrap(torch.rand(1, 10)) {"classification": [Tensor(1, 10)]} >>> node = BarNode() >>> node.wrap([torch.rand(1, 10), torch.rand(1, 10)]) {"features": [Tensor(1, 10), Tensor(1, 10)]} @type output: ForwardOutputT @param output: Output of the forward pass. @rtype: L{Packet}[Tensor] @return: Wrapped output. @raises ValueError: If the C{output} argument is not a tensor or a list of tensors. In such cases the L{wrap} method should be overridden.
method
run(self, inputs: list[Packet[Tensor]]) -> Packet[Tensor]: Packet[Tensor]
Combines the forward pass with the wrapping and unwrapping of the inputs. @type inputs: list[Packet[Tensor]] @param inputs: Inputs to the module. @rtype: L{Packet}[Tensor] @return: Outputs of the module as a dictionary of list of tensors: C{{"features": [Tensor, ...], "segmentation": [Tensor]}} @raises RuntimeError: If default L{wrap} or L{unwrap} methods are not sufficient.
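Conceptually, run composes the three steps described above. A simplified, runnable sketch (input validation omitted; the dummy node and helper below are hypothetical stand-ins, and details may differ from the actual implementation):

import torch

class _DummyNode:
    # Stand-in with the three hooks; real nodes inherit them from BaseNode.
    def unwrap(self, inputs):
        return inputs[0]["features"][-1]
    def forward(self, x):
        return x * 2
    def wrap(self, output):
        return {"features": [output]}

def run_sketch(node, inputs):
    unwrapped = node.unwrap(inputs)    # list[Packet[Tensor]] -> forward input
    outputs = node.forward(unwrapped)  # the node's own computation
    return node.wrap(outputs)          # forward output -> Packet[Tensor]

packet = {"features": [torch.zeros(1, 8, 4, 4), torch.ones(1, 16, 2, 2)]}
print(run_sketch(_DummyNode(), [packet])["features"][0].mean())  # tensor(2.)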
type variable
method
get_attached(self, lst: list[T]) -> list[T] | T: list[T] | T
Gets the attached elements from a list. This method is used to get the attached elements from a list based on the C{attach_index} attribute. @type lst: list[T] @param lst: List to get the attached elements from. Can be either a list of tensors or a list of sizes. @rtype: list[T] | T @return: Attached elements. If C{attach_index} is set to C{"all"} or is a slice, returns a list of attached elements. @raises ValueError: If the C{attach_index} is invalid.
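An illustrative sketch of the selection semantics described above, assuming tuples are interpreted as slice arguments (a simplification of the actual implementation; the function name is hypothetical):

def get_attached_sketch(lst, attach_index):
    if attach_index == "all":
        return lst
    if isinstance(attach_index, int):
        return lst[attach_index]           # single output, Python indexing rules
    if isinstance(attach_index, tuple):
        return lst[slice(*attach_index)]   # (start, stop) or (start, stop, step)
    raise ValueError(f"Invalid attach_index: {attach_index!r}")

features = ["P3", "P4", "P5"]
print(get_attached_sketch(features, -1))      # 'P5'
print(get_attached_sketch(features, (0, 2)))  # ['P3', 'P4']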
package
luxonis_train.nodes.blocks
module
class
class
class
class
class
class
class
class
DropPath
Drop paths (Stochastic Depth) per sample, when applied in the main path of residual blocks. Intended usage of this block is as follows: >>> class ResNetBlock(nn.Module): ... def __init__(self, ..., drop_path_rate: float): ... self.drop_path = DropPath(drop_path_rate) ... def forward(self, x): ... return x + self.drop_path(self.conv_bn_act(x))
class
class
class
class
class
class
class
class
class
UpscaleOnline
Upscale tensor to a specified size during the forward pass. This class supports cases where the required scale/size is only known when the input is received. Only the interpolation mode is set in advance.
function
autopad(kernel_size: T, padding: T | None = None) -> T: T
Compute padding based on kernel size. @type kernel_size: int | tuple[int, ...] @param kernel_size: Kernel size. @type padding: int | tuple[int, ...] | None @param padding: Padding. Defaults to None. @rtype: int | tuple[int, ...] @return: Computed padding. The output type is the same as the type of the C{kernel_size}.
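A minimal sketch consistent with the documented behavior, assuming the common kernel_size // 2 "same"-padding rule for odd kernels (a hypothetical helper, not the exact implementation):

def autopad_sketch(kernel_size, padding=None):
    # If padding is given, pass it through; otherwise derive it from the
    # kernel size, element-wise for tuple kernels.
    if padding is not None:
        return padding
    if isinstance(kernel_size, int):
        return kernel_size // 2
    return tuple(k // 2 for k in kernel_size)

print(autopad_sketch(3))       # 1
print(autopad_sketch((3, 5)))  # (1, 2)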
class
luxonis_train.nodes.blocks.blocks.BottleRep(torch.nn.Module)
class
luxonis_train.nodes.blocks.DFL(torch.nn.Module)
method
__init__(self, reg_max: int = 16)
The DFL (Distribution Focal Loss) module processes input tensors by applying softmax over a specified dimension and projecting the resulting tensor to produce output logits. @type reg_max: int @param reg_max: Maximum number of regression outputs. Defaults to 16.
variable
method
class
luxonis_train.nodes.blocks.AttentionRefinmentBlock(torch.nn.Module)
method
__init__(self, in_channels: int, out_channels: int)
Attention Refinement block adapted from U{https://github.com/taveraantonio/BiseNetv1}. @type in_channels: int @param in_channels: Number of input channels. @type out_channels: int @param out_channels: Number of output channels.
variable
variable
method
class
luxonis_train.nodes.blocks.BasicResNetBlock(torch.nn.Module)
method
__init__(self, in_planes: int, planes: int, stride: int = 1, expansion: int = 1, final_relu: bool = True, droppath_prob: float = 0.0)
A basic residual block for ResNet. @type in_planes: int @param in_planes: Number of input channels. @type planes: int @param planes: Number of output channels. @type stride: int @param stride: Stride for the convolutional layers. Defaults to 1. @type expansion: int @param expansion: Expansion factor for the output channels. Defaults to 1. @type final_relu: bool @param final_relu: Whether to apply a ReLU activation after the residual addition. Defaults to True. @type droppath_prob: float @param droppath_prob: Drop path probability for stochastic depth. Defaults to 0.0.
variable
variable
variable
variable
variable
variable
variable
variable
method
class
luxonis_train.nodes.blocks.BlockRepeater(torch.nn.Module)
method
__init__(self, block: type[nn.Module], in_channels: int, out_channels: int, n_blocks: int = 1)
Module which repeats the block n times. First block accepts in_channels and outputs out_channels while subsequent blocks accept out_channels and output out_channels. @type block: L{nn.Module} @param block: Block to repeat. @type in_channels: int @param in_channels: Number of input channels. @type out_channels: int @param out_channels: Number of output channels. @type n_blocks: int @param n_blocks: Number of blocks to repeat. Defaults to C{1}.
variable
method
class
luxonis_train.nodes.blocks.Bottleneck(torch.nn.Module)
method
__init__(self, in_planes: int, planes: int, stride: int = 1, expansion: int = 4, final_relu: bool = True, droppath_prob: float = 0.0)
A bottleneck block for ResNet. @type in_planes: int @param in_planes: Number of input channels. @type planes: int @param planes: Number of intermediate channels. @type stride: int @param stride: Stride for the second convolutional layer. Defaults to 1. @type expansion: int @param expansion: Expansion factor for the output channels. Defaults to 4. @type final_relu: bool @param final_relu: Whether to apply a ReLU activation after the residual addition. Defaults to True. @type droppath_prob: float @param droppath_prob: Drop path probability for stochastic depth. Defaults to 0.0.
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
class
luxonis_train.nodes.blocks.ConvModule(torch.nn.Sequential)
method
__init__(self, in_channels: int, out_channels: int, kernel_size: int | tuple[int, int], stride: int | tuple[int, int] = 1, padding: int | tuple[int, int] | str = 0, dilation: int | tuple[int, int] = 1, groups: int = 1, bias: bool = False, activation: nn.Module | None | Literal[False] = None, use_norm: bool = True, norm_momentum: float = 0.1)
Conv2d + Optional BN + Activation. @type in_channels: int @param in_channels: Number of input channels. @type out_channels: int @param out_channels: Number of output channels. @type kernel_size: int | tuple[int, int] @param kernel_size: Kernel size. @type stride: int | tuple[int, int] @param stride: Stride. Defaults to 1. @type padding: int | tuple[int, int] | str @param padding: Padding. Defaults to 0. @type dilation: int | tuple[int, int] @param dilation: Dilation. Defaults to 1. @type groups: int @param groups: Groups. Defaults to 1. @type bias: bool @param bias: Whether to use bias. Defaults to False. @type activation: L{nn.Module} | None | Literal[False] @param activation: Activation function. If None then nn.ReLU. If False then no activation. Defaults to None. @type use_norm: bool @param use_norm: Whether to use normalization. Defaults to True. @type norm_momentum: float @param norm_momentum: Momentum of the normalization layer. Defaults to 0.1.
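As a rough functional equivalent (ignoring the activation=False and norm_momentum details), the stack can be sketched as follows; the helper name is hypothetical:

import torch
from torch import nn

def conv_module_sketch(in_ch: int, out_ch: int, k: int, use_norm: bool = True,
                       activation: nn.Module | None = None) -> nn.Sequential:
    # Conv2d (bias only when BN is absent) + optional BatchNorm + activation.
    layers: list[nn.Module] = [nn.Conv2d(in_ch, out_ch, k, padding=k // 2, bias=not use_norm)]
    if use_norm:
        layers.append(nn.BatchNorm2d(out_ch))
    layers.append(activation if activation is not None else nn.ReLU())
    return nn.Sequential(*layers)

print(conv_module_sketch(3, 16, 3)(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 16, 32, 32])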
class
luxonis_train.nodes.blocks.CSPStackRepBlock(torch.nn.Module)
method
variable
variable
variable
variable
method
class
luxonis_train.nodes.blocks.DropPath(torch.nn.Module)
method
__init__(self, drop_prob: float = 0.0, scale_by_keep: bool = True)
Initializes the DropPath module. @type drop_prob: float @param drop_prob: Probability of zeroing out individual vectors (channel dimension) of each feature map. Defaults to 0.0. @type scale_by_keep: bool @param scale_by_keep: Whether to scale the output by the keep probability. Enabled by default to maintain output mean & std in the same range as without DropPath. Defaults to True.
variable
variable
method
drop_path(self, x: Tensor, drop_prob: float = 0.0, scale_by_keep: bool = True) -> Tensor: Tensor
Drop paths (Stochastic Depth) per sample when applied in the main path of residual blocks. @type x: Tensor @param x: Input tensor. @type drop_prob: float @param drop_prob: Probability of dropping a path. Defaults to 0.0. @type scale_by_keep: bool @param scale_by_keep: Whether to scale the output by the keep probability. Defaults to True. @return: Tensor with dropped paths based on the provided drop probability.
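A self-contained sketch of the standard stochastic-depth computation this method describes (one Bernoulli draw per sample, broadcast over the remaining dimensions, with optional rescaling by the keep probability; the function name is hypothetical):

import torch

def drop_path_sketch(x: torch.Tensor, drop_prob: float = 0.0, scale_by_keep: bool = True) -> torch.Tensor:
    # Applied at training time; with drop_prob == 0 the input passes through.
    if drop_prob == 0.0:
        return x
    keep_prob = 1.0 - drop_prob
    mask_shape = (x.shape[0],) + (1,) * (x.ndim - 1)
    mask = x.new_empty(mask_shape).bernoulli_(keep_prob)
    if scale_by_keep:
        mask = mask / keep_prob  # keep output mean in the same range
    return x * mask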
method
class
luxonis_train.nodes.blocks.DWConvModule(luxonis_train.nodes.blocks.ConvModule)
method
__init__(self, in_channels: int, out_channels: int, kernel_size: int, stride: int = 1, padding: int = 0, dilation: int = 1, bias: bool = False, activation: nn.Module | None = None)
Depth-wise Conv2d + BN + Activation. @type in_channels: int @param in_channels: Number of input channels. @type out_channels: int @param out_channels: Number of output channels. @type kernel_size: int @param kernel_size: Kernel size. @type stride: int @param stride: Stride. Defaults to 1. @type padding: int @param padding: Padding. Defaults to 0. @type dilation: int @param dilation: Dilation. Defaults to 1. @type bias: bool @param bias: Whether to use bias. Defaults to False. @type activation: L{nn.Module} | None @param activation: Activation function. If None then nn.ReLU.
class
luxonis_train.nodes.blocks.EfficientDecoupledBlock(torch.nn.Module)
method
__init__(self, n_classes: int, in_channels: int)
Efficient Decoupled block used for class and regression predictions. @type n_classes: int @param n_classes: Number of classes. @type in_channels: int @param in_channels: Number of input channels.
variable
variable
variable
method
class
luxonis_train.nodes.blocks.FeatureFusionBlock(torch.nn.Module)
method
__init__(self, in_channels: int, out_channels: int, reduction: int = 1)
Feature Fusion block adapted from: U{https://github.com/taveraantonio/BiseNetv1}. @type in_channels: int @param in_channels: Number of input channels. @type out_channels: int @param out_channels: Number of output channels. @type reduction: int @param reduction: Reduction factor. Defaults to C{1}.
variable
variable
method
class
luxonis_train.nodes.blocks.RepVGGBlock(torch.nn.Module)
method
__init__(self, in_channels: int, out_channels: int, kernel_size: int = 3, stride: int = 1, padding: int = 1, groups: int = 1, use_se: bool = False)
RepVGGBlock is a basic rep-style block, including training and deploy status. This code is based on U{https://github.com/DingXiaoH/RepVGG/blob/main/repvgg.py}. @type in_channels: int @param in_channels: Number of input channels. @type out_channels: int @param out_channels: Number of output channels. @type kernel_size: int @param kernel_size: Kernel size. Defaults to C{3}. @type stride: int @param stride: Stride. Defaults to C{1}. @type padding: int @param padding: Padding. Defaults to C{1}. @type groups: int @param groups: Groups. Defaults to C{1}. @type use_se: bool @param use_se: Whether to use SqueezeExciteBlock. Defaults to C{False}.
variable
variable
variable
variable
variable
variable
variable
variable
method
method
variable
class
luxonis_train.nodes.blocks.SegProto(torch.nn.Module)
method
__init__(self, in_channels: int, mid_channels: int = 256, out_channels: int = 32)
Initializes the segmentation prototype generator. @type in_channels: int @param in_channels: Number of input channels. @type mid_channels: int @param mid_channels: Number of intermediate channels. Defaults to 256. @type out_channels: int @param out_channels: Number of output channels. Defaults to 32.
variable
variable
variable
variable
method
forward(self, x: Tensor) -> Tensor: Tensor
Defines the forward pass of the segmentation prototype generator. @type x: Tensor @param x: Input tensor. @rtype: Tensor @return: Processed tensor.
class
luxonis_train.nodes.blocks.SpatialPyramidPoolingBlock(torch.nn.Module)
method
__init__(self, in_channels: int, out_channels: int, kernel_size: int = 5)
Spatial Pyramid Pooling block with ReLU activation on three different scales. @type in_channels: int @param in_channels: Number of input channels. @type out_channels: int @param out_channels: Number of output channels. @type kernel_size: int @param kernel_size: Kernel size. Defaults to C{5}.
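Usage sketch (values illustrative)::

    import torch
    from luxonis_train.nodes.blocks import SpatialPyramidPoolingBlock

    spp = SpatialPyramidPoolingBlock(in_channels=256, out_channels=256, kernel_size=5)
    out = spp(torch.rand(1, 256, 20, 20))  # pools on three scales and fuses; spatial size is preserved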
variable
variable
variable
method
class
luxonis_train.nodes.blocks.SqueezeExciteBlock(torch.nn.Module)
method
__init__(self, in_channels: int, intermediate_channels: int, approx_sigmoid: bool = False, activation: nn.Module | None = None)
Squeeze and Excite block, Adapted from U{Squeeze-and-Excitation Networks<https://arxiv.org/pdf/1709.01507.pdf>}. Code adapted from U{https://github.com/apple/ml-mobileone/blob/main/mobileone.py}. @type in_channels: int @param in_channels: Number of input channels. @type intermediate_channels: int @param intermediate_channels: Number of intermediate channels. @type approx_sigmoid: bool @param approx_sigmoid: Whether to use approximated sigmoid function. Defaults to False. @type activation: L{nn.Module} | None @param activation: Activation function. Defaults to L{nn.ReLU}.
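Usage sketch (values illustrative)::

    import torch
    from luxonis_train.nodes.blocks import SqueezeExciteBlock

    se = SqueezeExciteBlock(in_channels=64, intermediate_channels=16)
    out = se(torch.rand(1, 64, 28, 28))  # reweights channels; the output shape matches the input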
variable
variable
variable
variable
variable
method
class
luxonis_train.nodes.blocks.UpBlock(torch.nn.Sequential)
method
__init__(self, in_channels: int, out_channels: int, kernel_size: int = 2, stride: int = 2, upsample_mode: Literal['upsample', 'conv_transpose'] = 'upsample', inter_mode: str = 'bilinear', align_corners: bool = False)
Upsampling with ConvTranspose2D or Upsample (based on the mode). @type in_channels: int @param in_channels: Number of input channels. @type out_channels: int @param out_channels: Number of output channels. @type kernel_size: int @param kernel_size: Kernel size. Defaults to C{2}. @type stride: int @param stride: Stride. Defaults to C{2}. @type upsample_mode: Literal["upsample", "conv_transpose"] @param upsample_mode: Upsampling method, either 'conv_transpose' (for ConvTranspose2D) or 'upsample' (for nn.Upsample). @type inter_mode: str @param inter_mode: Interpolation mode used for nn.Upsample (e.g., 'bilinear', 'nearest'). @type align_corners: bool @param align_corners: Align corners option for upsampling methods that support it. Defaults to False.
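Usage sketch (values illustrative; the output shape assumes the default upscale factor given by stride=2)::

    import torch
    from luxonis_train.nodes.blocks import UpBlock

    up = UpBlock(in_channels=128, out_channels=64, upsample_mode="upsample", inter_mode="bilinear")
    out = up(torch.rand(1, 128, 16, 16))  # expected shape: [1, 64, 32, 32]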
class
luxonis_train.nodes.blocks.UpscaleOnline(torch.nn.Module)
method
__init__(self, mode: str = 'bilinear')
Initialize UpscaleOnline with the interpolation mode. @type mode: str @param mode: Interpolation mode for resizing. Defaults to "bilinear".
variable
method
forward(self, x: Tensor, output_height: int, output_width: int) -> Tensor: Tensor
Upscale the input tensor to the specified height and width. @type x: Tensor @param x: Input tensor to be upscaled. @type output_height: int @param output_height: Desired height of the output tensor. @type output_width: int @param output_width: Desired width of the output tensor. @rtype: Tensor @return: Upscaled tensor.
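Usage sketch::

    import torch
    from luxonis_train.nodes.blocks import UpscaleOnline

    upscale = UpscaleOnline(mode="bilinear")
    out = upscale(torch.rand(1, 3, 60, 80), output_height=120, output_width=160)
    # out.shape == (1, 3, 120, 160)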
package
luxonis_train.nodes.heads
module
module
module
module
package
module
module
module
module
module
module
module
module
class
BaseHead
Base class for all heads in the model.
class
class
class
class
class
class
class
class
class
class
class
package
luxonis_train.nodes.heads.discsubnet_head
module
luxonis_train.nodes.heads.discsubnet_head.blocks
class
luxonis_train.nodes.heads.discsubnet_head.blocks.Encoder(torch.nn.Module)
method
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
class
luxonis_train.nodes.heads.discsubnet_head.blocks.Decoder(torch.nn.Module)
method
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
class
luxonis_train.nodes.heads.discsubnet_head.blocks.NanoEncoder(torch.nn.Module)
method
variable
variable
variable
variable
method
class
luxonis_train.nodes.heads.discsubnet_head.blocks.NanoDecoder(torch.nn.Module)
method
variable
variable
variable
method
module
luxonis_train.nodes.heads.discsubnet_head.discsubnet_head
type alias
function
get_variant(variant: VariantLiteral) -> int: int
Returns the base width for the specified variant.
class
luxonis_train.nodes.heads.discsubnet_head.DiscSubNetHead(luxonis_train.nodes.heads.BaseHead)
variable
variable
variable
method
__init__(self, in_channels: list[int] | int = 6, out_channels: int = 2, base_channels: int | None = None, variant: VariantLiteral = 'l', kwargs)
DiscSubNetHead: A discriminative sub-network that detects and segments anomalies in images. This model is designed to take an input image and generate a mask that highlights anomalies or regions of interest based on reconstruction. The encoder extracts relevant features from the input, while the decoder generates a mask that identifies areas of anomalies by distinguishing between the reconstructed image and the input. @type in_channels: list[int] | int @param in_channels: Number of input channels for the encoder. Defaults to 6. @type out_channels: int @param out_channels: Number of output channels for the decoder. Defaults to 2 (for segmentation masks). @type base_channels: int | None @param base_channels: The base number of filters used in the encoder and decoder blocks. If None, it is determined based on the variant. @type variant: Literal["n", "l"] @param variant: The variant of the DiscSubNetHead to use. "l" for large, "n" for nano (lightweight). Defaults to "l".
variable
variable
method
forward(self, inputs: list[Tensor]) -> tuple[Tensor, Tensor]: tuple[Tensor, Tensor]
Performs the forward pass through the encoder and decoder.
method
wrap(self, output: tuple[Tensor, Tensor]) -> Packet[Tensor]: Packet[Tensor]
Wraps the output into a packet.
method
get_custom_head_config(self) -> dict: dict
Returns custom head configuration. @rtype: dict @return: Custom head configuration.
module
luxonis_train.nodes.heads.ocr_ctc_head
module
luxonis_train.nodes.heads.precision_seg_bbox_head
function
refine_and_apply_masks(mask_prototypes: Tensor, predicted_masks: Tensor, bounding_boxes: Tensor, height: int, width: int, upsample: bool = False) -> Tensor: Tensor
Refine and apply masks to bounding boxes based on the mask head outputs. @type mask_prototypes: Tensor @param mask_prototypes: Tensor of shape [mask_dim, mask_height, mask_width]. @type predicted_masks: Tensor @param predicted_masks: Tensor of shape [num_masks, mask_dim], where num_masks is the number of detected masks. @type bounding_boxes: Tensor @param bounding_boxes: Tensor of shape [num_masks, 4], containing bounding box coordinates. @type height: int @param height: Height of the input image. @type width: int @param width: Width of the input image. @type upsample: bool @param upsample: If True, upsample the masks to the target image dimensions. Default is False. @rtype: Tensor @return: A binary mask tensor of shape [num_masks, height, width], where the masks are cropped according to their respective bounding boxes.
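Usage sketch (shapes follow the parameter documentation above; treating the bounding boxes as image-space xyxy coordinates is an assumption)::

    import torch
    from luxonis_train.nodes.heads.precision_seg_bbox_head import refine_and_apply_masks

    protos = torch.rand(32, 160, 160)  # [mask_dim, mask_height, mask_width]
    coeffs = torch.rand(3, 32)         # [num_masks, mask_dim]
    boxes = torch.tensor([[10.0, 10.0, 300.0, 300.0]]).repeat(3, 1)  # [num_masks, 4]
    masks = refine_and_apply_masks(protos, coeffs, boxes, height=640, width=640, upsample=True)
    # masks: binary tensor of shape [num_masks, 640, 640], cropped to each box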
class
luxonis_train.nodes.heads.BaseHead(luxonis_train.nodes.base_node.BaseNode)
variable
parser
Parser to use for the head.
method
get_head_config(self) -> dict[str, Any]: dict[str, Any]
Get head configuration. @rtype: dict @return: Head configuration.
method
get_custom_head_config(self) -> dict[str, Any]: dict[str, Any]
Get custom head configuration. @rtype: dict @return: Custom head configuration.
class
luxonis_train.nodes.heads.BiSeNetHead(luxonis_train.nodes.heads.BaseHead)
variable
variable
variable
variable
method
__init__(self, intermediate_channels: int = 64, kwargs)
BiSeNet segmentation head. Source: U{BiseNetV1<https://github.com/taveraantonio/BiseNetv1>} @license: NOT SPECIFIED. @see: U{BiseNetv1: Bilateral Segmentation Network for Real-time Semantic Segmentation <https://arxiv.org/abs/1808.00897>} @type intermediate_channels: int @param intermediate_channels: How many intermediate channels to use. Defaults to C{64}.
variable
variable
variable
method
method
get_custom_head_config(self) -> dict: dict
Returns custom head configuration. @rtype: dict @return: Custom head configuration.
class
luxonis_train.nodes.heads.ClassificationHead(luxonis_train.nodes.heads.BaseHead)
variable
variable
method
__init__(self, dropout_rate: float = 0.2, kwargs)
Simple classification head. Consists of a global average pooling layer followed by a dropout layer and a single linear layer. @type dropout_rate: float @param dropout_rate: Dropout rate before last layer, range C{[0, 1]}. Defaults to C{0.2}.
variable
method
method
get_custom_head_config(self) -> dict[str, bool]: dict[str, bool]
Returns custom head configuration. @rtype: dict @return: Custom head configuration.
class
luxonis_train.nodes.heads.DDRNetSegmentationHead(luxonis_train.nodes.heads.BaseHead)
variable
variable
variable
variable
variable
method
__init__(self, inter_channels: int = 64, inter_mode: Literal['nearest', 'linear', 'bilinear', 'bicubic', 'trilinear', 'area', 'pixel_shuffle'] = 'bilinear', download_weights: bool = False, kwargs)
DDRNet segmentation head. @see: U{Adapted from <https://github.com/Deci-AI/super-gradients/blob/master/src/super_gradients/training/models/segmentation_models/ddrnet.py>} @see: U{Original code <https://github.com/ydhongHIT/DDRNet>} @see: U{Paper <https://arxiv.org/pdf/2101.06085.pdf>} @license: U{Apache License, Version 2.0 <https://github.com/Deci-AI/super-gradients/blob/master/LICENSE.md>} @type inter_channels: int @param inter_channels: Width of the internal convolutions. Must be a multiple of scale_factor^2 when inter_mode is pixel_shuffle. Defaults to 64. @type inter_mode: str @param inter_mode: Upsampling method. One of nearest, linear, bilinear, bicubic, trilinear, area or pixel_shuffle. If pixel_shuffle is set, nn.PixelShuffle is used for scaling. Defaults to "bilinear". @type download_weights: bool @param download_weights: If True, download weights pretrained on COCO. Defaults to False.
variable
variable
variable
variable
variable
variable
variable
method
method
method
get_custom_head_config(self) -> dict: dict
Returns custom head configuration. @rtype: dict @return: Custom head configuration.
class
luxonis_train.nodes.heads.EfficientBBoxHead(luxonis_train.nodes.heads.BaseHead)
variable
variable
method
__init__(self, n_heads: Literal[2, 3, 4] = 3, conf_thres: float = 0.25, iou_thres: float = 0.45, max_det: int = 300, download_weights: bool = False, initialize_weights: bool = True, kwargs)
Head for object detection. Adapted from U{YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications <https://arxiv.org/pdf/2209.02976.pdf>}. @type n_heads: Literal[2,3,4] @param n_heads: Number of output heads. Defaults to 3. B{Note:} Should typically be the same as on the neck. @type conf_thres: float @param conf_thres: Threshold for confidence. Defaults to C{0.25}. @type iou_thres: float @param iou_thres: Threshold for IoU. Defaults to C{0.45}. @type max_det: int @param max_det: Maximum number of detections retained after NMS. Defaults to C{300}. @type download_weights: bool @param download_weights: If True, download weights pretrained on COCO. Defaults to C{False}. @type initialize_weights: bool @param initialize_weights: If True, initialize weights. Defaults to C{True}.
variable
variable
variable
variable
variable
variable
variable
variable
method
method
method
method
method
method
get_custom_head_config(self) -> dict: dict
Returns custom head configuration. @rtype: dict @return: Custom head configuration.
class
luxonis_train.nodes.heads.EfficientKeypointBBoxHead(luxonis_train.nodes.heads.EfficientBBoxHead)
variable
method
__init__(self, n_heads: Literal[2, 3, 4] = 3, conf_thres: float = 0.25, iou_thres: float = 0.45, max_det: int = 300, kwargs)
Head for object and keypoint detection. Adapted from U{YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications<https://arxiv.org/pdf/2209.02976.pdf>}. @type n_heads: Literal[2, 3, 4] @param n_heads: Number of output heads. Defaults to C{3}. B{Note:} Should typically be the same as on the neck. @type conf_thres: float @param conf_thres: Threshold for confidence. Defaults to C{0.25}. @type iou_thres: float @param iou_thres: Threshold for IoU. Defaults to C{0.45}. @type max_det: int @param max_det: Maximum number of detections retained after NMS. Defaults to C{300}.
variable
variable
method
method
variable
variable
variable
method
method
get_custom_head_config(self) -> dict: dict
Returns custom head configuration. @rtype: dict @return: Custom head configuration.
class
luxonis_train.nodes.heads.FOMOHead(luxonis_train.nodes.base_node.BaseNode)
variable
variable
method
__init__(self, num_conv_layers: int = 3, conv_channels: int = 16, use_nms: bool = True, kwargs)
FOMO Head for object detection using heatmaps. @type num_conv_layers: int @param num_conv_layers: Number of convolutional layers to use. Defaults to C{3}. @type conv_channels: int @param conv_channels: Number of channels to use in the convolutional layers. Defaults to C{16}. @type use_nms: bool @param use_nms: Whether to use non-maximum suppression on the predictions. Defaults to C{True}.
variable
variable
variable
variable
variable
property
method
method
class
luxonis_train.nodes.heads.GhostFaceNetHead(luxonis_train.nodes.heads.BaseHead)
variable
variable
method
__init__(self, embedding_size: int = 512, cross_batch_memory_size: int | None = None, dropout: float = 0.2, kwargs)
GhostFaceNetV2 head for embedding tasks. GhostFaceNetV2 is a convolutional neural network architecture focused on face recognition, but it is adaptable to generic embedding tasks. It is based on the GhostNet architecture and uses Ghost BottleneckV2 blocks. Source: U{https://github.com/Hazqeel09/ellzaf_ml/blob/main/ellzaf_ml/models/ghostfacenetsv2.py} @license: U{MIT License <https://github.com/Hazqeel09/ellzaf_ml/blob/main/LICENSE>} @see: U{GhostFaceNets: Lightweight Face Recognition Model From Cheap Operations <https://www.researchgate.net/publication/369930264_GhostFaceNets_Lightweight_Face_Recognition_Model_from_Cheap_Operations>} @type embedding_size: int @param embedding_size: Size of the embedding. Defaults to 512. @type cross_batch_memory_size: int | None @param cross_batch_memory_size: Size of the cross-batch memory. Defaults to None. @type dropout: float @param dropout: Dropout rate. Defaults to 0.2.
variable
variable
variable
method
class
luxonis_train.nodes.heads.OCRCTCHead(luxonis_train.nodes.heads.BaseHead)
variable
variable
method
__init__(self, alphabet: list[str], ignore_unknown: bool = True, fc_decay: float = 0.0004, mid_channels: int | None = None, return_feats: bool = False, kwargs)
OCR CTC head. @see: U{Adapted from <https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppocr/modeling/heads/rec_ctc_head.py>} @see: U{Original code <https://github.com/PaddlePaddle/PaddleOCR>} @license: U{Apache License, Version 2.0 <https://github.com/PaddlePaddle/PaddleOCR/blob/main/LICENSE>} @type alphabet: list[str] @param alphabet: List of characters. @type ignore_unknown: bool @param ignore_unknown: Whether to ignore unknown characters. Defaults to True. @type fc_decay: float @param fc_decay: L2 regularization factor. Defaults to 0.0004. @type mid_channels: int | None @param mid_channels: Number of middle channels. Defaults to None. @type return_feats: bool @param return_feats: Whether to return features. Defaults to False.
variable
variable
variable
variable
method
method
get_custom_head_config(self) -> dict: dict
Returns custom head configuration. @rtype: dict @return: Custom head configuration.
property
property
class
luxonis_train.nodes.heads.PrecisionBBoxHead(luxonis_train.nodes.heads.BaseHead)
variable
variable
method
__init__(self, reg_max: int = 16, n_heads: Literal[2, 3, 4] = 3, conf_thres: float = 0.25, iou_thres: float = 0.45, max_det: int = 300, kwargs)
Adapted from U{Real-Time Flying Object Detection with YOLOv8 <https://arxiv.org/pdf/2305.09972>} and from U{YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications <https://arxiv.org/pdf/2209.02976.pdf>}. @type reg_max: int @param reg_max: Maximum number of regression channels. Defaults to C{16}. @type n_heads: Literal[2, 3, 4] @param n_heads: Number of output heads. Defaults to C{3}. @type conf_thres: float @param conf_thres: Confidence threshold for NMS. Defaults to C{0.25}. @type iou_thres: float @param iou_thres: IoU threshold for NMS. Defaults to C{0.45}. @type max_det: int @param max_det: Maximum number of detections retained after NMS. Defaults to C{300}.
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
method
method
method
bias_init(self)
Initialize biases for the detection heads. Assumes detection_heads structure with separate regression and classification branches.
method
method
get_custom_head_config(self) -> dict: dict
Returns custom head configuration. @rtype: dict @return: Custom head configuration.
class
luxonis_train.nodes.heads.PrecisionSegmentBBoxHead(luxonis_train.nodes.heads.PrecisionBBoxHead)
variable
method
__init__(self, n_heads: Literal[2, 3, 4] = 3, n_masks: int = 32, n_proto: int = 256, conf_thres: float = 0.25, iou_thres: float = 0.45, max_det: int = 300, kwargs)
Head for instance segmentation and object detection. Adapted from U{Real-Time Flying Object Detection with YOLOv8 <https://arxiv.org/pdf/2305.09972>} and from U{YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications <https://arxiv.org/pdf/2209.02976.pdf>}. @type n_heads: Literal[2, 3, 4] @param n_heads: Number of output heads. Defaults to C{3}. @type n_masks: int @param n_masks: Number of masks. Defaults to C{32}. @type n_proto: int @param n_proto: Number of prototypes for segmentation. Defaults to C{256}. @type conf_thres: float @param conf_thres: Confidence threshold for NMS. Defaults to C{0.25}. @type iou_thres: float @param iou_thres: IoU threshold for NMS. Defaults to C{0.45}. @type max_det: int @param max_det: Maximum number of detections retained after NMS. Defaults to C{300}.
variable
variable
variable
variable
method
method
method
method
get_custom_head_config(self) -> dict: dict
Returns custom head configuration. @rtype: dict @return: Custom head configuration.
class
luxonis_train.nodes.heads.SegmentationHead(luxonis_train.nodes.heads.BaseHead)
variable
variable
variable
variable
method
__init__(self, kwargs: Any)
Basic segmentation FCN head. Adapted from: U{https://github.com/pytorch/vision/blob/main/torchvision/models/segmentation/fcn.py} @license: U{BSD-3 <https://github.com/pytorch/vision/blob/main/LICENSE>}
variable
method
method
get_custom_head_config(self) -> dict: dict
Returns custom head configuration. @rtype: dict @return: Custom head configuration.
package
luxonis_train.nodes.necks
package
luxonis_train.nodes.necks.reppan_neck
module
luxonis_train.nodes.necks.reppan_neck.blocks
class
class
class
class
class
class
class
luxonis_train.nodes.necks.reppan_neck.blocks.PANUpBlockBase(abc.ABC, torch.nn.Module)
method
__init__(self, in_channels: int, out_channels: int)
Base RepPANNeck up block. @type in_channels: int @param in_channels: Number of input channels. @type out_channels: int @param out_channels: Number of output channels.
variable
variable
property
encode_block
Encode block that is used. Make sure the actual module is initialized in C{__init__} and not inside this property, otherwise it will be reinitialized on every call.
method
class
luxonis_train.nodes.necks.reppan_neck.blocks.RepUpBlock(luxonis_train.nodes.necks.reppan_neck.blocks.PANUpBlockBase)
method
__init__(self, in_channels: int, in_channels_next: int, out_channels: int, n_repeats: int)
RepPANNeck up block for smaller networks that uses RepBlock. @type in_channels: int @param in_channels: Number of input channels. @type in_channels_next: int @param in_channels_next: Number of input channels of the next input, which is used in concat. @type out_channels: int @param out_channels: Number of output channels. @type n_repeats: int @param n_repeats: Number of RepVGGBlock repeats.
property
class
luxonis_train.nodes.necks.reppan_neck.blocks.CSPUpBlock(luxonis_train.nodes.necks.reppan_neck.blocks.PANUpBlockBase)
method
__init__(self, in_channels: int, in_channels_next: int, out_channels: int, n_repeats: int, e: float)
RepPANNeck up block for larger networks that uses CSPStackRepBlock. @type in_channels: int @param in_channels: Number of input channels. @type in_channels_next: int @param in_channels_next: Number of input channels of the next input, which is used in concat. @type out_channels: int @param out_channels: Number of output channels. @type n_repeats: int @param n_repeats: Number of RepVGGBlock repeats. @type e: float @param e: Factor that controls the number of intermediate channels.
property
class
luxonis_train.nodes.necks.reppan_neck.blocks.PANDownBlockBase(abc.ABC, torch.nn.Module)
method
__init__(self, in_channels: int, downsample_out_channels: int)
Base RepPANNeck down block. @type in_channels: int @param in_channels: Number of input channels. @type downsample_out_channels: int @param downsample_out_channels: Number of output channels after downsample.
variable
property
encode_block
Encode block that is used. Make sure the actual module is initialized in C{__init__} and not inside this property, otherwise it will be reinitialized on every call.
method
class
luxonis_train.nodes.necks.reppan_neck.blocks.RepDownBlock(luxonis_train.nodes.necks.reppan_neck.blocks.PANDownBlockBase)
method
__init__(self, in_channels: int, downsample_out_channels: int, in_channels_next: int, out_channels: int, n_repeats: int)
RepPANNeck down block for smaller networks that uses RepBlock. @type in_channels: int @param in_channels: Number of input channels. @type downsample_out_channels: int @param downsample_out_channels: Number of output channels after downsample. @type in_channels_next: int @param in_channels_next: Number of input channels of the next input, which is used in concat. @type out_channels: int @param out_channels: Number of output channels. @type n_repeats: int @param n_repeats: Number of RepVGGBlock repeats.
property
class
luxonis_train.nodes.necks.reppan_neck.blocks.CSPDownBlock(luxonis_train.nodes.necks.reppan_neck.blocks.PANDownBlockBase)
method
__init__(self, in_channels: int, downsample_out_channels: int, in_channels_next: int, out_channels: int, n_repeats: int, e: float)
RepPANNeck down block for larger networks that uses CSPStackRepBlock. @type in_channels: int @param in_channels: Number of input channels. @type downsample_out_channels: int @param downsample_out_channels: Number of output channels after downsample. @type in_channels_next: int @param in_channels_next: Number of input channels of the next input, which is used in concat. @type out_channels: int @param out_channels: Number of output channels. @type n_repeats: int @param n_repeats: Number of RepVGGBlock repeats. @type e: float @param e: Factor that controls the number of intermediate channels.
property
module
luxonis_train.nodes.necks.reppan_neck.variants
class
luxonis_train.nodes.necks.reppan_neck.variants.RepPANNeckVariant(pydantic.BaseModel)
variable
variable
variable
variable
class
luxonis_train.nodes.necks.reppan_neck.RepPANNeck(luxonis_train.nodes.base_node.BaseNode)
variable
variable
variable
method
__init__(self, variant: VariantLiteral = 'nano', n_heads: Literal[2, 3, 4] = 3, channels_list: list[int] | None = None, n_repeats: list[int] | None = None, depth_mul: float | None = None, width_mul: float | None = None, block: Literal['RepBlock', 'CSPStackRepBlock'] | None = None, csp_e: float | None = None, download_weights: bool = False, initialize_weights: bool = True, kwargs)
Implementation of the RepPANNeck module. Supports the version with RepBlock and the version with CSPStackRepBlock (for larger networks). Adapted from U{YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications<https://arxiv.org/pdf/2209.02976.pdf>}. It balances feature fusion ability and hardware efficiency. @type variant: Literal["n", "nano", "s", "small", "m", "medium", "l", "large"] @param variant: RepPANNeck variant. Defaults to "nano". The variant determines the depth and width multipliers, the block used and the intermediate channel scaling factor. The depth multiplier determines the number of blocks in each stage and the width multiplier determines the number of channels. The following variants are available: - "n" or "nano" (default): depth_multiplier=0.33, width_multiplier=0.25, block=RepBlock, e=None - "s" or "small": depth_multiplier=0.33, width_multiplier=0.50, block=RepBlock, e=None - "m" or "medium": depth_multiplier=0.60, width_multiplier=0.75, block=CSPStackRepBlock, e=2/3 - "l" or "large": depth_multiplier=1.0, width_multiplier=1.0, block=CSPStackRepBlock, e=1/2 @type n_heads: Literal[2,3,4] @param n_heads: Number of output heads. Defaults to 3. B{Note: Should typically be the same as on the head.} @type channels_list: list[int] | None @param channels_list: List of number of channels for each block. Defaults to C{[256, 128, 128, 256, 256, 512]}. @type n_repeats: list[int] | None @param n_repeats: List of number of repeats of RepVGGBlock. Defaults to C{[12, 12, 12, 12]}. @type depth_mul: float | None @param depth_mul: Depth multiplier. Defaults to C{0.33}. @type width_mul: float | None @param width_mul: Width multiplier. Defaults to C{0.25}. @type block: Literal["RepBlock", "CSPStackRepBlock"] | None @param block: Base block used when building the backbone. If provided, overrides the variant value. @type csp_e: float | None @param csp_e: Factor that controls the number of intermediate channels if block="CSPStackRepBlock". If provided, overrides the variant value. @type download_weights: bool @param download_weights: If True, download weights from COCO (if available for the specified variant). Defaults to False. @type initialize_weights: bool @param initialize_weights: If True, initialize weights of the model.
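Illustrative variant selection. In practice the node is instantiated by the framework from a model config, so the remaining constructor wiring is omitted here::

    from luxonis_train.nodes.necks.reppan_neck import RepPANNeck

    # Per the variant table above, "m" resolves to depth_mul=0.60, width_mul=0.75
    # and CSPStackRepBlock with csp_e=2/3:
    # neck = RepPANNeck(variant="m", n_heads=3)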
variable
variable
variable
method
method
method
set_export_mode(self, mode: bool = True)
Reparametrizes instances of L{RepVGGBlock} in the network. @type mode: bool @param mode: Whether to set the export mode. Defaults to C{True}.
module
luxonis_train.nodes.necks.svtr_neck.blocks
class
luxonis_train.nodes.necks.svtr_neck.blocks.Im2Seq(torch.nn.Module)
class
luxonis_train.nodes.necks.svtr_neck.blocks.Mlp(torch.nn.Module)
class
luxonis_train.nodes.necks.svtr_neck.blocks.ConvMixer(torch.nn.Module)
method
variable
variable
variable
variable
method
class
luxonis_train.nodes.necks.svtr_neck.blocks.Attention(torch.nn.Module)
method
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
class
luxonis_train.nodes.necks.svtr_neck.blocks.SVTRBlock(torch.nn.Module)
method
variable
variable
variable
variable
variable
variable
variable
method
class
luxonis_train.nodes.necks.svtr_neck.blocks.EncoderWithSVTR(torch.nn.Module)
method
variable
variable
variable
variable
variable
variable
variable
variable
variable
variable
method
class
luxonis_train.nodes.necks.svtr_neck.SVTRNeck(luxonis_train.nodes.base_node.BaseNode)
variable
method
__init__(self, kwargs)
Initializes the SVTR neck. @see: U{Adapted from <https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppocr/modeling/necks/rnn.py>} @see: U{Original code <https://github.com/PaddlePaddle/PaddleOCR>} @license: U{Apache License, Version 2.0 <https://github.com/PaddlePaddle/PaddleOCR/blob/main/LICENSE>}
variable
variable
variable
method
package
luxonis_train.optimizers
module
module
luxonis_train.registry
constant
constant
constant
constant
constant
constant
constant
constant
constant
constant
type variable
function
from_registry(registry: Registry[type[T]], key: str, args, kwargs) -> T: T
Get an instance of the class registered under the given key. @type registry: Registry[type[T]] @param registry: Registry to get the class from. @type key: str @param key: Key to get the class for. @rtype: T @return: Instance of the class registered under the given key.
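Behaviorally, C{from_registry(registry, key, *args, **kwargs)} looks up the class registered under C{key} and instantiates it with the given arguments. A minimal stand-in sketch (not the actual implementation; the plain dict below replaces a real C{Registry})::

    import torch.nn as nn

    def from_registry_sketch(registry, key, *args, **kwargs):
        # Look up the registered class and instantiate it.
        return registry[key](*args, **kwargs)

    activations = {"relu": nn.ReLU, "silu": nn.SiLU}  # stand-in registry
    act = from_registry_sketch(activations, "silu")   # -> nn.SiLU()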
package
luxonis_train.schedulers
module
package
luxonis_train.strategies
module
luxonis_train.strategies.triple_lr_sgd
class
luxonis_train.strategies.triple_lr_sgd.TripleLRSGD
variable
variable
variable
variable
variable
method
class
luxonis_train.strategies.triple_lr_sgd.TripleLRSGDStrategy(luxonis_train.strategies.base_strategy.BaseTrainingStrategy)
method
__init__(self, pl_module: lxt.LuxonisLightningModule, lr: float = 0.02, momentum: float = 0.937, weight_decay: float = 0.0005, nesterov: bool = True, warmup_epochs: int = 3, warmup_bias_lr: float = 0.1, warmup_momentum: float = 0.8, lre: float = 0.0002, cosine_annealing: bool = True)
TripleLRSGD strategy. @type pl_module: lxt.LuxonisLightningModule @param pl_module: The Lightning module the strategy is applied to. @type lr: float @param lr: The learning rate. Defaults to C{0.02}. @type momentum: float @param momentum: The momentum. Defaults to C{0.937}. @type weight_decay: float @param weight_decay: The weight decay. Defaults to C{0.0005}. @type nesterov: bool @param nesterov: Whether to use Nesterov momentum. Defaults to C{True}. @type warmup_epochs: int @param warmup_epochs: The number of warmup epochs. Defaults to C{3}. @type warmup_bias_lr: float @param warmup_bias_lr: The warmup bias learning rate. Defaults to C{0.1}. @type warmup_momentum: float @param warmup_momentum: The warmup momentum. Defaults to C{0.8}. @type lre: float @param lre: The learning rate at the end of training. Defaults to C{0.0002}. @type cosine_annealing: bool @param cosine_annealing: Whether to use cosine annealing. Defaults to C{True}.
variable
variable
variable
variable
method
method
class
luxonis_train.strategies.BaseTrainingStrategy(abc.ABC)
class
luxonis_train.strategies.TripleLRScheduler
variable
variable
variable
variable
variable
variable
variable
variable
method
variable
variable
variable
variable
method
method
module
luxonis_train.tasks
class
class
class
class
class
class
class
class
class
class
class
class
class
class
class
class
luxonis_train.tasks.staticproperty
class
luxonis_train.tasks.Metadata
variable
variable
method
method
method
method
class
luxonis_train.tasks.Task(abc.ABC)
class
luxonis_train.tasks.Classification(luxonis_train.tasks.Task)
class
luxonis_train.tasks.Segmentation(luxonis_train.tasks.Task)
class
luxonis_train.tasks.InstanceBaseTask(luxonis_train.tasks.Task)
property
class
luxonis_train.tasks.BoundingBox(luxonis_train.tasks.InstanceBaseTask)
method
class
luxonis_train.tasks.InstanceSegmentation(luxonis_train.tasks.InstanceBaseTask)
class
luxonis_train.tasks.InstanceKeypoints(luxonis_train.tasks.InstanceBaseTask)
class
luxonis_train.tasks.Keypoints(luxonis_train.tasks.Task)
class
luxonis_train.tasks.Fomo(luxonis_train.tasks.InstanceBaseTask)
class
luxonis_train.tasks.Embeddings(luxonis_train.tasks.Task)
class
luxonis_train.tasks.AnomalyDetection(luxonis_train.tasks.Task)
class
luxonis_train.tasks.Ocr(luxonis_train.tasks.Task)
class
luxonis_train.tasks.Tasks
property
property
property
property
property
property
property
property
property
property
module
luxonis_train.typing
type alias
Labels: TypeAlias
Labels is a dictionary mapping task names to tensors.
type alias
AttachIndexType: TypeAlias
AttachIndexType is used to specify which output of the previous node the current node attaches to. It can be either "all" (all outputs), an index of the output, or a tuple of indices (specifying a range of outputs).
type variable
type alias
Packet: TypeAlias
Packet is a dictionary containing either a single instance or a list of either `torch.Tensor`s or `torch.Size`s. Packets are used to pass data between nodes of the network graph.
package
luxonis_train.utils
module
module
module
module
module
module
module
module
function
anchors_for_fpn_features(features: list[Tensor], strides: Tensor, grid_cell_size: float = 5.0, grid_cell_offset: float = 0.5, multiply_with_stride: bool = False) -> tuple[Tensor, Tensor, list[int], Tensor]: tuple[Tensor, Tensor, list[int], Tensor]
Generates anchor boxes, points and strides based on FPN feature shapes and strides. @type features: list[Tensor] @param features: List of FPN features. @type strides: Tensor @param strides: Strides of FPN features. @type grid_cell_size: float @param grid_cell_size: Cell size with respect to the input image size. Defaults to 5.0. @type grid_cell_offset: float @param grid_cell_offset: Offset of the grid cell center, as a fraction of the cell. Defaults to 0.5. @type multiply_with_stride: bool @param multiply_with_stride: Whether to multiply each FPN feature's values by its stride. Defaults to False. @rtype: tuple[Tensor, Tensor, list[int], Tensor] @return: BBox anchors, center anchors, number of anchors per level, strides
function
apply_bounding_box_to_masks(masks: Tensor, bounding_boxes: Tensor) -> Tensor: Tensor
Crops the given masks to the regions specified by the corresponding bounding boxes. @type masks: Tensor @param masks: Masks tensor of shape [n, h, w]. @type bounding_boxes: Tensor @param bounding_boxes: Bounding boxes tensor of shape [n, 4]. @rtype: Tensor @return: Cropped masks tensor of shape [n, h, w].
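Usage sketch (treating box coordinates as pixel-space xyxy values is an assumption)::

    import torch
    from luxonis_train.utils import apply_bounding_box_to_masks

    masks = torch.ones(1, 8, 8)                   # [n, h, w]
    boxes = torch.tensor([[2.0, 2.0, 6.0, 6.0]])  # [n, 4]
    cropped = apply_bounding_box_to_masks(masks, boxes)
    # pixels outside each box are zeroed; the shape stays [n, h, w]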
function
bbox2dist(bbox: Tensor, anchor_points: Tensor, reg_max: float) -> Tensor: Tensor
Transform bbox(xyxy) to distance(ltrb). @type bbox: Tensor @param bbox: Bboxes in "xyxy" format @type anchor_points: Tensor @param anchor_points: Head's anchor points @type reg_max: float @param reg_max: Maximum regression distances @rtype: Tensor @return: BBoxes in distance(ltrb) format
function
bbox_iou(bbox1: Tensor, bbox2: Tensor, bbox_format: BBoxFormatType = 'xyxy', iou_type: IoUType = 'none', element_wise: bool = False) -> Tensor: Tensor
Computes IoU between two sets of bounding boxes. @type bbox1: Tensor @param bbox1: First set of bboxes [N, 4]. @type bbox2: Tensor @param bbox2: Second set of bboxes [M, 4]. @type bbox_format: BBoxFormatType @param bbox_format: Input bounding box format. Defaults to C{"xyxy"}. @type iou_type: Literal["none", "giou", "diou", "ciou", "siou"] @param iou_type: IoU type. Defaults to "none". Possible values are: - "none": standard IoU - "giou": Generalized IoU - "diou": Distance IoU - "ciou": Complete IoU. Introduced in U{Enhancing Geometric Factors in Model Learning and Inference for Object Detection and Instance Segmentation<https://arxiv.org/pdf/2005.03572.pdf>}. Implementation adapted from torchvision C{complete_box_iou} with improved stability. - "siou": SIoU. Introduced in U{SIoU Loss: More Powerful Learning for Bounding Box Regression<https://arxiv.org/pdf/2205.12740.pdf>}. @type element_wise: bool @param element_wise: If True returns element-wise IoUs. Defaults to False. @rtype: Tensor @return: IoU between bbox1 and bbox2. If element_wise is True returns an [N] tensor, otherwise returns an [N, M] tensor.
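Usage sketch::

    import torch
    from luxonis_train.utils import bbox_iou

    bbox1 = torch.tensor([[0.0, 0.0, 10.0, 10.0]])                          # [N, 4]
    bbox2 = torch.tensor([[0.0, 0.0, 10.0, 10.0], [5.0, 5.0, 15.0, 15.0]])  # [M, 4]
    iou = bbox_iou(bbox1, bbox2)  # pairwise IoU matrix of shape [N, M]
    # iou[0, 0] == 1.0 (identical boxes); iou[0, 1] == 25 / 175 ≈ 0.143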
function
compute_iou_loss(pred_bboxes: Tensor, target_bboxes: Tensor, target_scores: Tensor | None = None, mask_positive: Tensor | None = None, iou_type: IoUType = 'giou', bbox_format: BBoxFormatType = 'xyxy', reduction: Literal['sum', 'mean'] = 'mean') -> tuple[Tensor, Tensor]: tuple[Tensor, Tensor]
Computes an IoU loss between 2 sets of bounding boxes. @type pred_bboxes: Tensor @param pred_bboxes: Predicted bounding boxes. @type target_bboxes: Tensor @param target_bboxes: Target bounding boxes. @type target_scores: Tensor | None @param target_scores: Target scores. Defaults to None. @type mask_positive: Tensor | None @param mask_positive: Mask for positive samples. Defaults to None. @type iou_type: L{IoUType} @param iou_type: IoU type. Defaults to "giou". @type bbox_format: L{BBoxFormatType} @param bbox_format: BBox format. Defaults to "xyxy". @type reduction: Literal["sum", "mean"] @param reduction: Reduction type. Defaults to "mean". @rtype: tuple[Tensor, Tensor] @return: IoU loss and IoU values.
function
dist2bbox(distance: Tensor, anchor_points: Tensor, out_format: BBoxFormatType = 'xyxy', dim: int = -1) -> Tensor: Tensor
Transform distance (ltrb) to box ("xyxy", "xywh" or "cxcywh"). @type distance: Tensor @param distance: Distance predictions @type anchor_points: Tensor @param anchor_points: Head's anchor points @type out_format: BBoxFormatType @param out_format: BBox output format. Defaults to "xyxy". @type dim: int @param dim: Dimension to split distance tensor. Defaults to -1. @rtype: Tensor @return: BBoxes in the chosen format
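Round-trip sketch of the ltrb convention shared by C{bbox2dist} and C{dist2bbox}, assuming the standard definition where distances are measured from the anchor point to each box edge::

    import torch
    from luxonis_train.utils import bbox2dist, dist2bbox

    anchor_points = torch.tensor([[10.0, 10.0]])
    bbox = torch.tensor([[8.0, 7.0, 14.0, 15.0]])        # xyxy
    dist = bbox2dist(bbox, anchor_points, reg_max=16.0)  # ltrb: [[2., 3., 4., 5.]]
    back = dist2bbox(dist, anchor_points)                # recovers the original xyxy box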
function
keypoints_to_bboxes(keypoints: list[Tensor], img_height: int, img_width: int, box_width: int = 5, visibility_threshold: float = 0.5) -> list[Tensor]: list[Tensor]
Convert keypoints to bounding boxes in xyxy format with cls_id and score, filtering low-visibility keypoints. @type keypoints: list[Tensor] @param keypoints: List of tensors of keypoints with shape [N, 1, 4] (x, y, v, cls_id). @type img_height: int @param img_height: Height of the image. @type img_width: int @param img_width: Width of the image. @type box_width: int @param box_width: Width of the bounding box in pixels. Defaults to C{5}. @type visibility_threshold: float @param visibility_threshold: Minimum visibility score to include a keypoint. Defaults to 0.5. @rtype: list[Tensor] @return: List of tensors of bounding boxes with shape [N, 6] (x_min, y_min, x_max, y_max, score, cls_id).
function
non_max_suppression(preds: Tensor, n_classes: int, conf_thres: float = 0.25, iou_thres: float = 0.45, keep_classes: list[int] | None = None, agnostic: bool = False, multi_label: bool = False, bbox_format: BBoxFormatType = 'xyxy', max_det: int = 300, predicts_objectness: bool = True) -> list[Tensor]: list[Tensor]
Non-maximum suppression on model's predictions to keep only best instances. @type preds: Tensor @param preds: Model's prediction tensor of shape [bs, N, M]. @type n_classes: int @param n_classes: Number of model's classes. @type conf_thres: float @param conf_thres: Boxes with confidence higher than this will be kept. Defaults to 0.25. @type iou_thres: float @param iou_thres: Boxes with IoU higher than this will be discarded. Defaults to 0.45. @type keep_classes: list[int] | None @param keep_classes: Subset of classes to keep; if None, all classes are kept. Defaults to None. @type agnostic: bool @param agnostic: Whether to perform NMS per class or treat all classes the same. Defaults to False. @type multi_label: bool @param multi_label: Whether one prediction can have multiple labels. Defaults to False. @type bbox_format: BBoxFormatType @param bbox_format: Input bbox format. Defaults to "xyxy". @type max_det: int @param max_det: Maximum number of output detections. Defaults to 300. @type predicts_objectness: bool @param predicts_objectness: Whether the head predicts objectness confidence. Defaults to True. @rtype: list[Tensor] @return: List of kept detections for each image, boxes in "xyxy" format. Tensors with shape [n_kept, M]
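Usage sketch (a YOLO-style prediction layout of 4 box values, 1 objectness score and per-class scores is an assumption here; random values are used only to exercise the shapes)::

    import torch
    from luxonis_train.utils import non_max_suppression

    preds = torch.rand(2, 100, 4 + 1 + 80)  # [bs, N, M]
    kept = non_max_suppression(preds, n_classes=80, conf_thres=0.25, iou_thres=0.45)
    # one tensor per image; boxes in the kept detections are in "xyxy" format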
class
DatasetMetadata
Metadata about the dataset.
exception
IncompatibleError
Raised when two parts of the model are incompatible with each other.
function
get_attribute_check_none(obj: object, attribute: str) -> Any: Any
Get private attribute from object and check if it is not None. Example: >>> class Person: ... def __init__(self, age: int | None = None): ... self._age = age ... ... @property ... def age(self): ... return get_attribute_check_none(self, "age") >>> mike = Person(20) >>> print(mike.age) 20 >>> amanda = Person() >>> print(amanda.age) Traceback (most recent call last): ValueError: attribute 'age' was not set @type obj: object @param obj: Object to get attribute from. @type attribute: str @param attribute: Name of the attribute to get. @rtype: Any @return: Value of the attribute. @raise ValueError: If the attribute is None.
function
get_batch_instances(batch_index: int, bboxes: Tensor, payload: Tensor | None = None) -> Tensor: Tensor
Get instances from batched data, where the batch index is encoded as the first column of the bounding boxes. @type batch_index: int @param batch_index: Batch index. @type bboxes: Tensor @param bboxes: Tensor of bounding boxes. Must have the batch index as the first column. @type payload: Tensor | None @param payload: Additional tensor to be batched with the bounding boxes. This tensor is in the same batch order, but doesn't contain the batch index itself. If unset, returns the bounding box instances (without the batch index). @rtype: Tensor @return: Instances from the batched data.
function
get_with_default(value: T | None, action_name: str, caller_name: str | None = None, default: T) -> T: T
Returns value if it is not C{None}, otherwise returns the default value and logs an info message. @type value: T | None @param value: Value to return. @type action_name: str @param action_name: Name of the action for which the default value is being used. Used for logging. @type caller_name: str | None @param caller_name: Name of the caller function. Used for logging. @type default: T @param default: Default value to return if C{value} is C{None}. @rtype: T @return: C{value} if it is not C{None}, otherwise C{default}.
function
infer_upscale_factor(in_size: tuple[int, int] | int, orig_size: tuple[int, int] | int) -> int: int
Infer the upscale factor from the input shape and the original shape. @type in_size: tuple[int, int] | int @param in_size: Input shape as a tuple of (height, width) or just one of them. @type orig_size: tuple[int, int] | int @param orig_size: Original shape as a tuple of (height, width) or just one of them. @rtype: int @return: Upscale factor. @raise ValueError: If the C{in_size} cannot be upscaled to the C{orig_size}. This can happen if the upscale factors are not integers or are different.
function
instances_from_batch(bboxes: Tensor, args: Tensor, batch_size: int | None = None) -> Iterator[tuple[Tensor, ...]]|Iterator[Tensor]: Iterator[tuple[Tensor, ...]]|Iterator[Tensor]
Generate instances from batched data, where the batch index is encoded as the first column of the bounding boxes. Example:: >>> bboxes = torch.tensor([[0, 1], [0, 2], [1, 3]]) >>> keypoints = torch.tensor([[0.1], [0.2], [0.3]]) >>> for bbox, kpt in instances_from_batch(bboxes, keypoints): ... print(bbox, kpt) tensor([[1], [2]]) tensor([[0.1], [0.2]]) tensor([[3]]) tensor([[0.3]]) @type bboxes: Tensor @param bboxes: Tensor of bounding boxes. Must have the batch index as the first column. @type *args: Tensor @param *args: Additional tensors to be batched with the bounding boxes. These tensors are in the same batch order, but don't contain the batch index themselves. @type batch_size: int | None @param batch_size: The batch size. Important in case of empty tensors. If provided and the tensors are empty, the generator will yield C{batch_size} empty tensors. If not provided, the generator will yield nothing. Defaults to C{None}. @rtype: Iterator[tuple[Tensor, ...]] @return: Generator of instances, where the first element is the bounding box tensor (with the batch index stripped) and the rest are the additional tensors (keypoints, masks, etc.).
function
make_divisible(x: float, divisor: int) -> int: int
Revises the value C{x} upward to make it evenly divisible by the divisor. Equivalent to M{ceil(x / divisor) * divisor}. @type x: int | float @param x: Value to be revised. @type divisor: int @param divisor: Divisor. @rtype: int @return: Revised value.
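A minimal behavioral sketch of the formula::

    import math

    def make_divisible_sketch(x: float, divisor: int) -> int:
        # Equivalent to ceil(x / divisor) * divisor
        return math.ceil(x / divisor) * divisor

    assert make_divisible_sketch(27.2, 8) == 32  # ceil(3.4) * 8
    assert make_divisible_sketch(32, 8) == 32    # already-divisible values are unchanged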
function
safe_download(url: str, file: str | None = None, dir: PathType = '.cache/luxonis_train', retry: int = 3, force: bool = False) -> Path|None: Path|None
Downloads file from the web and returns either the local path or None if downloading failed. @type url: str @param url: URL of the file you want to download. @type file: str | None @param file: Name of the saved file; if None, it is inferred from the URL. Defaults to None. @type dir: PathType @param dir: Directory to store the downloaded file in. Defaults to C{'.cache/luxonis_train'}. @type retry: int @param retry: Number of retries when downloading. Defaults to 3. @type force: bool @param force: Whether to force redownload if the file already exists. Defaults to False. @rtype: Path | None @return: Path to the local file, or None if downloading failed.
function
to_shape_packet(packet: Packet[Tensor]) -> Packet[Size]: Packet[Size]
Converts a packet of tensors to a packet of shapes. Used for debugging purposes. @type packet: Packet[Tensor] @param packet: Packet of tensors. @rtype: Packet[Size] @return: Packet of shapes.
function
compute_pose_oks(predictions: Tensor, targets: Tensor, sigmas: Tensor, gt_bboxes: Tensor | None = None, pose_area: Tensor | None = None, eps: float = 1e-09, area_factor: float = 0.53, use_cocoeval_oks: bool = True) -> Tensor: Tensor
Compute batched Object Keypoint Similarity (OKS) between ground truth and predicted keypoints. @type predictions: Tensor @param predictions: Predicted keypoints with shape [N, M2, n_keypoints, 3] @type targets: Tensor @param targets: Ground truth keypoints with shape [N, M1, n_keypoints, 3] @type sigmas: Tensor @param sigmas: Sigmas for each keypoint, shape [n_keypoints] @type gt_bboxes: Tensor | None @param gt_bboxes: Ground truth bounding boxes in XYXY format with shape [N, M1, 4] @type pose_area: Tensor | None @param pose_area: Area of the pose, shape [N, M1, 1, 1] @type eps: float @param eps: A small constant to ensure numerical stability @type area_factor: float @param area_factor: Factor to scale the area of the pose @type use_cocoeval_oks: bool @param use_cocoeval_oks: Whether to use the same OKS formula as in COCOEval or the one from the definition. Defaults to True. @rtype: Tensor @return: A tensor of OKS values with shape [N, M1, M2]
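Usage sketch (shapes follow the documentation above; using C{get_sigmas} to obtain default keypoint sigmas is an assumption)::

    import torch
    from luxonis_train.utils import compute_pose_oks, get_sigmas

    preds = torch.rand(2, 5, 17, 3)    # [N, M2, n_keypoints, 3]
    targets = torch.rand(2, 4, 17, 3)  # [N, M1, n_keypoints, 3]
    xy = torch.rand(2, 4, 2) * 100
    gt_bboxes = torch.cat([xy, xy + torch.rand(2, 4, 2) * 50], dim=-1)  # valid XYXY boxes
    sigmas = get_sigmas(None, n_keypoints=17)
    oks = compute_pose_oks(preds, targets, sigmas, gt_bboxes=gt_bboxes)
    # oks.shape == (2, 4, 5), i.e. [N, M1, M2]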
function
get_center_keypoints(bboxes: Tensor, height: int = 1, width: int = 1) -> Tensor: Tensor
Get center keypoints from bounding boxes. @type bboxes: Tensor @param bboxes: Tensor of bounding boxes. @type height: int @param height: Image height. Defaults to C{1} (normalized). @type width: int @param width: Image width. Defaults to C{1} (normalized). @rtype: Tensor @return: Tensor of center keypoints.
function
get_sigmas(sigmas: list[float] | None, n_keypoints: int, caller_name: str | None = None) -> Tensor: Tensor
Validate or create sigma values for each keypoint. @type sigmas: list[float] | None @param sigmas: List of sigmas for each keypoint. If C{None}, then default sigmas are used. @type n_keypoints: int @param n_keypoints: Number of keypoints. @type caller_name: str | None @param caller_name: Name of the caller function. Used for logging. @rtype: Tensor @return: Tensor of sigmas.
function
insert_class(keypoints: Tensor, bboxes: Tensor) -> Tensor: Tensor
Insert class index into keypoints tensor. @type keypoints: Tensor @param keypoints: Tensor of keypoints. @type bboxes: Tensor @param bboxes: Tensor of bounding boxes with class index. @rtype: Tensor @return: Tensor of keypoints with class index.
function
class
OCRDecoder
OCR decoder for converting model predictions to text.
class
OCREncoder
OCR encoder for converting text to model targets.
class
LuxonisTrackerPL
Implementation of LuxonisTracker that is compatible with PytorchLightning.
module
luxonis_train.utils.boundingbox
module
luxonis_train.utils.general
type variable
function
clean_url(url: str) -> str: str
Strip auth from URL, i.e. https://url.com/file.txt?auth -> https://url.com/file.txt.
function
url2file(url: str) -> str: str
Convert URL to filename, i.e. https://url.com/file.txt?auth -> file.txt.
class
luxonis_train.utils.DatasetMetadata
method
__init__(self, classes: dict[str, dict[str, int]] | None = None, n_keypoints: dict[str, int] | None = None, metadata_types: dict[str, type[int] | type[Category] | type[float] | type[str]] | None = None, loader: BaseLoaderTorch | None = None)
An object containing metadata about the dataset. Used to infer the number of classes, number of keypoints, I{etc.} instead of passing them as arguments to the model. @type classes: dict[str, dict[str, int]] | None @param classes: Dictionary mapping tasks to the classes. @type n_keypoints: dict[str, int] | None @param n_keypoints: Dictionary mapping tasks to the number of keypoints. @type loader: BaseLoaderTorch | None @param loader: Dataset loader.
property
task_names
Gets the names of the tasks present in the dataset.
method
n_classes(self, task_name: str | None = None) -> int: int
Gets the number of classes for the specified task. @type task_name: str | None @param task_name: Task to get the number of classes for. @rtype: int @return: Number of classes for the specified task type. @raises ValueError: If the C{task} is not present in the dataset. @raises RuntimeError: If the C{task} was not provided and the dataset contains different number of classes for different task types.
method
n_keypoints(self, task_name: str | None = None) -> int: int
Gets the number of keypoints for the specified task. @type task_name: str | None @param task_name: Task to get the number of keypoints for. @rtype: int @return: Number of keypoints for the specified task type. @raises ValueError: If the C{task} is not present in the dataset. @raises RuntimeError: If the C{task} was not provided and the dataset contains different number of keypoints for different task types.
method
classes(self, task_name: str | None = None) -> bidict[str, int]: bidict[str, int]
Gets the class names for the specified task. @type task_name: str | None @param task_name: Task to get the class names for. @rtype: bidict[str, int] @return: Dictionary mapping class names to class indices for the specified task. @raises ValueError: If the C{task} is not present in the dataset. @raises RuntimeError: If the C{task} was not provided and the dataset contains different class names for different label types.
property
metadata_types
Gets the types of metadata for the dataset.
CLASS_METHOD
from_loader
Creates a L{DatasetMetadata} object from a L{LuxonisDataset}. @type loader: LuxonisDataset @param loader: Loader to read the metadata from. @rtype: DatasetMetadata @return: Instance of L{DatasetMetadata} created from the provided dataset.
class
luxonis_train.utils.OCRDecoder
method
__init__(self, char_to_int: dict, ignored_tokens: list[int] | None = None, is_remove_duplicate: bool = True)
Initializes the OCR decoder. @type char_to_int: dict @param char_to_int: A dictionary mapping characters to integers. @type ignored_tokens: list[int] | None @param ignored_tokens: A list of tokens to ignore when decoding. Defaults to C{None}, in which case C{[0]} is used. @type is_remove_duplicate: bool @param is_remove_duplicate: Whether to remove duplicate characters. Defaults to True.
variable
variable
variable
method
decode(self, preds: Tensor) -> list[tuple[str, float]]: list[tuple[str, float]]
Decodes the model predictions to text. @type preds: Tensor @param preds: A tensor containing the model predictions. @rtype: list[tuple[str, float]] @return: A list of tuples containing the decoded text and confidence score.
method
class
luxonis_train.utils.OCREncoder
method
__init__(self, alphabet: list[str], ignore_unknown: bool = True)
Initializes the OCR encoder. @type alphabet: list[str] @param alphabet: A list of characters in the alphabet. @type ignore_unknown: bool @param ignore_unknown: Whether to ignore unknown characters. Defaults to True.
variable
variable
method
encode(self, targets: Tensor) -> Tensor: Tensor
Encodes the text targets to model targets. @type targets: Tensor @param targets: A tensor of text targets. @rtype: Tensor @return: A tensor containing the encoded targets.
method
property
property
class
luxonis_train.utils.LuxonisTrackerPL(luxonis_ml.tracker.LuxonisTracker, lightning.pytorch.loggers.logger.Logger)
method
__init__(self, _auto_finalize: bool = True, kwargs)
@type _auto_finalize: bool @param _auto_finalize: If True, the run will be finalized automatically when the training ends. If set to C{False}, the user will have to call the L{_finalize} method manually. @type kwargs: dict @param kwargs: Additional keyword arguments to be passed to the L{LuxonisTracker}.
variable