Low Latency

These tables show the performance you can expect from a USB 3.2 Gen 1 (5 Gbps) connection with an OAK camera. XLink chunking was disabled for these tests (pipeline.setXLinkChunkSize(0)). For example code, see Latency measurement.

What          Resolution   FPS    FPS set   Time-to-Host [ms]   Bandwidth      Histogram
Color (isp)   1080P        60     60        33                  1.5 Gbps       link
Color (isp)   4K           28.5   30        150                 2.8 Gbps       link
Mono          720P/800P    120    120       24.5                442/482 Mbps   link
Mono          400P         120    120       7.5                 246 Mbps       link

  • Time-to-Host is the time measured between the frame timestamp (imgFrame.getTimestamp()) and the host timestamp when the frame is received (dai.Clock.now()).

  • Histogram shows how much Time-to-Host varies from frame to frame. The Y axis represents the number of frames that occurred at that time, while the X axis represents microseconds.

  • Bandwidth is the calculated bandwidth required to stream the specified frames at the specified FPS.
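The Time-to-Host and Bandwidth definitions above can be sketched as two small helpers. This is a minimal illustration, not the code used to produce the tables: the function names are hypothetical, the timestamps are assumed to be datetime.timedelta values (which is what the two calls named above return in the Python API), and the bytes-per-pixel figures (1.5 for isp output, assuming YUV420, and 1 for mono) are assumptions.

```python
from datetime import timedelta


def time_to_host_ms(frame_ts: timedelta, host_ts: timedelta) -> float:
    """Milliseconds between the frame timestamp and the host receive time.

    On a real device the arguments would come from
    imgFrame.getTimestamp() and dai.Clock.now() respectively.
    """
    return (host_ts - frame_ts) / timedelta(milliseconds=1)


def raw_bandwidth_bps(width: int, height: int, bytes_per_pixel: float, fps: float) -> float:
    """Bits per second needed to stream unencoded frames at the given rate."""
    return width * height * bytes_per_pixel * 8 * fps


# Stand-in timestamps; no device is needed for this sketch.
frame_ts = timedelta(seconds=12, milliseconds=100)
host_ts = timedelta(seconds=12, milliseconds=133)
print(time_to_host_ms(frame_ts, host_ts))                # 33.0

# Assumed pixel sizes: 1.5 bytes (YUV420) for isp, 1 byte for mono.
print(raw_bandwidth_bps(1920, 1080, 1.5, 60) / 1e9)      # 1.492992 (Gbps)
print(raw_bandwidth_bps(3840, 2160, 1.5, 28.5) / 1e9)    # 2.8366848 (Gbps)
print(raw_bandwidth_bps(640, 400, 1, 120) / 1e6)         # 245.76 (Mbps)
```

Under these assumptions the results round to the 1.5 Gbps, 2.8 Gbps and 246 Mbps figures in the table. Note that the 4K row only matches when the achieved rate of 28.5 FPS is used rather than the configured 30 FPS.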

Encoded frames

What                Resolution   FPS    FPS set   Time-to-Host [ms]   Histogram
Color video H.265   4K           28.5   30        210                 link
Color video MJPEG   4K           30     30        71                  link
Color video H.265   1080P        60     60        42                  link
Color video MJPEG   1080P        60     60        31                  link
Mono H.265          800P         60     60        23.5                link
Mono MJPEG          800P         60     60        22.5                link
Mono H.265          400P         120    120       7.5                 link
Mono MJPEG          400P         120    120       7.5                 link

You can also reduce frame latency by using the Zero-Copy branch of DepthAI. It passes pointers (at the XLink level) to cv2.Mat instead of doing a memcopy (as the current release does), so the performance improvement depends on the image sizes you are using. (Note: the API differs, and not all functionality is available on the message_zero_copy branch.)

Reducing latency when running NN

In the examples above we were only streaming frames, without doing anything else on the OAK camera. This section focuses on how to reduce latency when also running a neural network (NN) model on the OAK.

Lowering camera FPS to match NN FPS

Lowering the camera FPS so that it does not exceed what the NN can process typically gives the best latency, since the NN can start inference as soon as a new frame is available.

For example, with 15 FPS we get a total of about 70 ms latency, measured from capture time (end of exposure and MIPI readout start).

This time includes the following:

  • MIPI readout

  • ISP processing

  • Preview post-processing

  • NN processing

  • Streaming to host

  • And finally, any additional latency before the result reaches the app

Note: if the FPS is increased slightly, towards 19 to 21 FPS, an extra latency of about 10 ms appears, which we believe is related to firmware. We are actively looking for improvements to lower latency.
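To illustrate why matching the camera rate to the NN rate helps, here is a toy simulation (not a measurement). It assumes frames are captured at a fixed period, the NN needs a fixed time per frame, and the NN input keeps only the newest frame. The function name and the 40 ms inference time are hypothetical values chosen for illustration.

```python
def frame_ages_ms(frame_period_ms, nn_time_ms, n_inferences):
    """Age of the frame the NN picks up at the start of each inference.

    Toy model: frame k is captured at k * frame_period_ms, the NN takes a
    fixed nn_time_ms per frame, and its input holds only the newest frame.
    """
    ages = []
    nn_free_at = 0   # time at which the NN can accept its next frame
    last_used = -1   # index of the last frame the NN consumed
    for _ in range(n_inferences):
        newest = int(nn_free_at // frame_period_ms)  # newest frame captured so far
        if newest <= last_used:
            # No fresh frame is waiting: the NN idles until the next capture.
            newest = last_used + 1
            start = newest * frame_period_ms
        else:
            start = nn_free_at
        ages.append(start - newest * frame_period_ms)
        last_used = newest
        nn_free_at = start + nn_time_ms
    return ages


# Camera matched to a hypothetical 40 ms NN (25 FPS): frames are always fresh.
print(frame_ages_ms(frame_period_ms=40, nn_time_ms=40, n_inferences=6))
# [0, 0, 0, 0, 0, 0]

# Camera faster than the NN (25 ms frame time): frames wait a varying time.
print(frame_ages_ms(frame_period_ms=25, nn_time_ms=40, n_inferences=6))
# [0, 15, 5, 20, 10, 0]
```

In this idealized model the matched case adds no queueing delay, while the faster camera adds a delay that changes from frame to frame. Real devices have further effects, such as contention for shared hardware resources, that the model ignores.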

NN input queue size and blocking behaviour

If the app already calls detNetwork.input.setBlocking(False) but leaves the queue size unchanged, the following adjustment may help improve latency:

After adding detNetwork.input.setQueueSize(1) and setting the camera FPS back to 40, we get about 80 to 105 ms latency. One cause of this variation is that the camera produces frames at a fixed rate (25 ms frame time), which does not line up with the moments when the NN has finished and can accept a new frame.
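A pipeline using these settings might look like the sketch below. It assumes the DepthAI v2 Python API; the node type (MobileNetDetectionNetwork), the 300x300 preview size, the stream name and the blob_path argument are placeholders chosen for illustration, not details from the tests above. depthai is imported inside the function so that the small frame-time helper can be used without the SDK installed.

```python
def frame_time_ms(fps: float) -> float:
    """Time between consecutive camera frames, in milliseconds."""
    return 1000.0 / fps


def build_pipeline(blob_path: str, camera_fps: float = 40.0):
    """Camera -> detection network, with a size-1, non-blocking NN input."""
    import depthai as dai  # assumed: DepthAI v2 Python package

    pipeline = dai.Pipeline()

    cam = pipeline.create(dai.node.ColorCamera)
    cam.setPreviewSize(300, 300)   # placeholder NN input size
    cam.setInterleaved(False)
    cam.setFps(camera_fps)

    det_network = pipeline.create(dai.node.MobileNetDetectionNetwork)
    det_network.setBlobPath(blob_path)        # placeholder model path
    det_network.input.setBlocking(False)      # do not stall the camera when the queue is full
    det_network.input.setQueueSize(1)         # keep only one pending frame
    cam.preview.link(det_network.input)

    xout = pipeline.create(dai.node.XLinkOut)
    xout.setStreamName("nn")                  # placeholder stream name
    det_network.out.link(xout.input)
    return pipeline


print(frame_time_ms(40))  # 25.0
```

With a queue size of 1 and a non-blocking input, the NN input holds at most one pending frame, so the NN does not work through a backlog of older frames.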
