Inference
Overview
Inference can be run using two types of nodes:
- Built-in nodes (run directly on Luxonis devices), and
- Host nodes (run on the host).
If the model of choice is not converted for a desired RVC platform, please refer to the Conversion section.
Installation
Install the required packages using pip:
Command Line
pip install depthai --force-reinstall
pip install depthai-nodes
Inference Pipeline
The inference pipeline consists of the following building blocks:
- Camera,
- Model and Parser(s),
- Queue(s), and
- Results.
Python
import depthai as dai
from depthai_nodes.node import ParsingNeuralNetwork

model = "..." # NN Archive or HubAI model identifier

# Create pipeline
with dai.Pipeline() as pipeline:

    # Camera
    camera = pipeline.create(dai.node.Camera).build()

    # Model and Parser(s)
    nn_with_parser = pipeline.create(ParsingNeuralNetwork).build(
        camera, model
    )

    # Queue(s)
    parser_output_queue = nn_with_parser.out.createOutputQueue()

    # Start pipeline
    pipeline.start()

    while pipeline.isRunning():

        # Results
        ...
Aside from defining the HubAI model identifier, the template above should work out of the box. Beware, however, that some OAK devices have internal FPS limitations (e.g. OAK-D Lite). You can set the FPS limit as pipeline.create(ParsingNeuralNetwork).build(..., fps=<limit>).
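For example, to cap the model at 10 FPS (the value here is purely illustrative):
Python
nn_with_parser = pipeline.create(ParsingNeuralNetwork).build(
    camera, model, fps=10 # limit inference to 10 FPS
)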
Camera
Create the Camera node as:
Python
camera_node = pipeline.create(dai.node.Camera).build()
Model and Parser(s)
The model and its parser(s) can be set up in two ways:
- Automatically, using the ParsingNeuralNetwork node, or
- Manually, initializing them as independent nodes and linking them together.
Automatic Setup
The ParsingNeuralNetwork node extends the standard NeuralNetwork node by adding automatic parsing capabilities for model outputs. It can be imported from the depthai_nodes package as:
Python
from depthai_nodes.node import ParsingNeuralNetwork
You can build the node from a local NN Archive:
Python
# Set up NN Archive
nn_archive = dai.NNArchive(<path/to/NNArchiveName.tar.xz>)

# Set up model (with parser(s)) and link it to camera output
nn_with_parser = pipeline.create(ParsingNeuralNetwork).build(
    cameraNode, nn_archive
)
Alternatively, build it directly from a HubAI model identifier:
Python
# Set up the HubAI model identifier
model = "..."

# Set up model with parser(s)
nn_with_parser = pipeline.create(ParsingNeuralNetwork).build(
    camera_node, model
)
If you plan to use a private HubAI model, make sure to configure your Luxonis Hub API Key. Instructions are available on the API Key Good Practices page. Once set up correctly, the API key will be applied automatically to authenticate your requests with the HubAI platform.
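As a sketch, you can also supply the key explicitly when fetching a private model (the DEPTHAI_HUB_API_KEY variable name and the apiKey parameter of dai.getModelFromZoo are assumptions here; follow the API Key Good Practices page for the authoritative setup):
Python
import os

import depthai as dai

# Assumption: the key was exported as DEPTHAI_HUB_API_KEY in the environment
api_key = os.environ.get("DEPTHAI_HUB_API_KEY", "")

model_description = dai.NNModelDescription(model="...") # HubAI identifier
# Assumption: getModelFromZoo accepts an apiKey argument for private models
archive_path = dai.getModelFromZoo(model_description, apiKey=api_key)
nn_archive = dai.NNArchive(archive_path)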
Manual Setup
To set the model and parser(s) up manually, import the DepthAI Nodes parser of interest or implement a parser of your own:
Python
from depthai_nodes.node import <ParserNode>
# OR:
class ParserNode(dai.node.ThreadedHostNode):
    def __init__(self) -> None:
        super().__init__()
        self.input = self.createInput()
        self.out = self.createOutput()

    def build(self) -> "ParserNode":
        return self

    def run(self) -> None:
        while self.isRunning():
            nn_out_raw = self.input.get()
            nn_out_processed = ... # custom post-processing
            self.out.send(nn_out_processed)
Instantiate the model and the parser using the create() method on the pipeline:
Python
model = pipeline.create(dai.node.NeuralNetwork)
parser = pipeline.create(<ParserNode>)
The parser can be configured in two ways (a short sketch follows this list):
- at initialization, configuration can be set by passing the parameter values as arguments to the create() method: parser = pipeline.create(<ParserNode>, <ParameterName>=<ParameterValue>, ...). If configuring multiple parameters, you can arrange them into a dict and pass it as an argument to the parser build() method: parser = pipeline.create(<ParserNode>).build(config_dict);
- after initialization, one can change configuration by using the setter methods: parser.<SetterMethodName>(<ParameterValue>). You can find all the setter methods available for a specific parser on the DepthAI Nodes API Reference page.
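A minimal sketch of both styles, where SomeParser, its conf_threshold parameter, and the setConfThreshold setter are hypothetical stand-ins for a concrete parser's API:
Python
from depthai_nodes.node import SomeParser # hypothetical parser class

# At initialization: pass parameter values to create()
parser = pipeline.create(SomeParser, conf_threshold=0.5)

# Or gather multiple parameters into a dict and pass it to build()
config = {"conf_threshold": 0.5, "max_detections": 100} # hypothetical parameters
parser = pipeline.create(SomeParser).build(config)

# After initialization: use the setter methods
parser.setConfThreshold(0.5) # hypothetical setter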
Next, set the path to the model executable (the .blob file for RVC2, or the .dlc file for RVC4):
Python
model.setModelPath(<path/to/model_executable>)
Finally, link the camera, model, and parser together:
Python
width, height = ... # model input size
camera_stream = camera.requestOutput(size=(width, height))
camera_stream.link(model.input)
model.out.link(parser.input)
If interested in building more advanced parsers—similar to our native ones that automatically process NN Archives for setup—check out the parsers section of the DepthAI Nodes library. There, you can explore how we've implemented them in practice.
Queue(s)
To access the frames that are passed to the model, create a queue on the passthrough output:
Python
frame_queue = nn_with_parser.passthrough.createOutputQueue()
Single-Headed
For models with a single output head, create a queue on the out output:
Python
parser_output_queue = nn_with_parser.out.createOutputQueue()
Multi-Headed
For models with multiple output heads, create a queue per head using getOutput(<head_index>):
Python
head0_parser_output_queue = nn_with_parser.getOutput(0).createOutputQueue()
head1_parser_output_queue = nn_with_parser.getOutput(1).createOutputQueue()
...
Results
After calling pipeline.start(), outputs can be obtained from the defined queue(s). You can obtain the input frame and parsed model outputs as:
Python
while pipeline.isRunning():

    # Get Camera Output
    frame_queue_output = frame_queue.get()
    frame = frame_queue_output.getCvFrame()
    ...

    # Get Parsed Output(s)
    parser_output = parser_output_queue.get()
    ...
The parsed outputs are either:
- generic DepthAI messages, or
- custom-written DepthAI Nodes messages.
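For example, detection parsers emit the ImgDetectionsExtended message from DepthAI Nodes. A minimal sketch of consuming it (the import path and attribute names reflect recent depthai-nodes releases and may differ in your version):
Python
from depthai_nodes.message import ImgDetectionsExtended

parser_output = parser_output_queue.get()
if isinstance(parser_output, ImgDetectionsExtended):
    for detection in parser_output.detections:
        # each detection carries a class label and a confidence score
        print(detection.label, detection.confidence)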
Examples
Troubleshooting
Setting Model SHAVEs
When running models on RVC2 devices, you may encounter warnings like:
Command Line
NeuralNetwork: Blob compiled for ... shaves, but only ... are available in current configuration
Command Line
[14442C103180EECF00] [2.1] [4.736] [NeuralNetwork(2)] [warning] Network compiled for 8 shaves, maximum available 13, compiling for 6 shaves likely will yield in better performance
To resolve this:
- if the model was exported with the legacy Blobconverter, you can recompile the model with a matching number of SHAVEs, or
- if the model was exported within HubAI, recompilation is not needed - you can set the number of SHAVEs at pipeline initialization to match the device requirements:
Python
nn_archive = dai.NNArchive(...)
nn_with_parser = pipeline.create(ParsingNeuralNetwork).build(
    ..., nn_archive
)
# Set the number of SHAVEs
nn_with_parser.setNNArchive(
    nn_archive, numShaves=<Number>
)
The old SHAVE configuration method is no longer supported in DepthAI v3. Avoid using:
Python
nn = pipeline.create(dai.node.NeuralNetwork)
nn.setNumShaves(6)
Changing Parser Parameters
How you access the parser depends on the pipeline setup:
- Pipelines with a separate parser node: Simply access the parser node directly.
- Pipelines with a ParsingNeuralNetwork node: In this case, the parser is integrated with the AI model. Retrieve it by calling the .getParser() method on the ParsingNeuralNetwork node, as shown in the sketch below.
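A minimal sketch for the ParsingNeuralNetwork case (assuming getParser() accepts the head index, with 0 for single-headed models):
Python
# Retrieve the parser integrated into the ParsingNeuralNetwork node
parser = nn_with_parser.getParser(0)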
Then change the desired parameter through its setter method, e.g.:
Python
parser.setConfThreshold(0.5)
Different Visualization and Model Input Sizes
If the resolution you want to display differs from the model input size, use the ImageManip node to resize the image before sending it to the model, while keeping the original resolution for display. Example:
Python
cam = pipeline.create(dai.node.Camera).build()

# Request specific image size for capture
cam_out = cam.requestOutput(size=(<width1>, <height1>))

# Create and configure resize node for model input
resize_node = pipeline.create(dai.node.ImageManip)
resize_node.initialConfig.setOutputSize(<width2>, <height2>)
cam_out.link(resize_node.inputImage)

# Define model with resized input
nn_with_parser: ParsingNeuralNetwork = pipeline.create(ParsingNeuralNetwork).build(
    resize_node.out, ...
)

# Visualize Using Original Resolution
video_queue = cam_out.createOutputQueue() # high-res stream
detection_queue = nn_with_parser.out.createOutputQueue() # detections on the low-res stream
...
Model Location When Downloaded from HubAI
Models downloaded from HubAI are cached in the .depthai_cached_models folder at the project root. This cache contains all models from previous runs. If a model is already cached, it's loaded locally instead of re-downloading. To force a fresh download, you can use the useCached=False parameter when downloading the model. Example:
Python
nn_archive = dai.NNArchive(dai.getModelFromZoo(model_description, useCached=False))
nn_with_parser = pipeline.create(ParsingNeuralNetwork).build(
    ..., nn_archive
)
Alternatively, you can delete the .depthai_cached_models folder and re-run the pipeline.
Further Reading
Concurrent Model Execution on RVC4
Resource allocation on HTP
- HTP compute cores and V-TCM memory are shared elastically across all concurrent SNPE sessions.
- The internal scheduler uses a round-robin strategy; the split of resources between multiple models is not fixed and can vary from frame to frame.
Control knobs
- Resource steering is not supported. There is no per-model priority, core-affinity, or quota API. The only global lever is the --perf_profile flag, which affects power/performance trade-offs at the SoC level.
Practical guidance
- Assume latency and throughput will fluctuate when concurrent sessions start or stop.
- Use SNPE timing logs, such as layer-wise profiling, to measure end-to-end latency instead of guessing resource shares (see the sketch below).
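For illustration, a minimal sketch of running two models concurrently in one pipeline (the model identifiers are placeholders; how the HTP splits resources between the two sessions is governed by the scheduler described above):
Python
import depthai as dai
from depthai_nodes.node import ParsingNeuralNetwork

model_a = "..." # HubAI model identifier (placeholder)
model_b = "..." # HubAI model identifier (placeholder)

with dai.Pipeline() as pipeline:
    camera = pipeline.create(dai.node.Camera).build()

    # Two NN nodes share the HTP; the SNPE scheduler splits compute
    # cores and V-TCM between their sessions at runtime
    nn_a = pipeline.create(ParsingNeuralNetwork).build(camera, model_a)
    nn_b = pipeline.create(ParsingNeuralNetwork).build(camera, model_b)

    queue_a = nn_a.out.createOutputQueue()
    queue_b = nn_b.out.createOutputQueue()

    pipeline.start()
    while pipeline.isRunning():
        # Per-model latency may fluctuate as the sessions compete
        # for HTP resources
        out_a = queue_a.get()
        out_b = queue_b.get()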