# StereoDepth

The StereoDepth node calculates disparity and/or depth from a stereo camera pair (2x
[MonoCamera](https://docs.luxonis.com/software/depthai/examples/mono_camera_control.md)/[ColorCamera](https://docs.luxonis.com/software/depthai-components/nodes/color_camera.md)).
We suggest following the [Configuring Stereo Depth](https://docs.luxonis.com/software/depthai-components/nodes/stereo_depth.md)
tutorial to achieve the best depth results.

## How to place it

#### Python

```python
pipeline = dai.Pipeline()
stereo = pipeline.create(dai.node.StereoDepth)
```

#### C++

```cpp
dai::Pipeline pipeline;
auto stereo = pipeline.create<dai::node::StereoDepth>();
```

## Inputs and Outputs

The node's main inputs are left and right frames, and its main outputs are disparity and depth, along with rectified and synced
passthrough frames (see the Reference section below for the full list). There are also debug outputs (all are
[ImgFrame](https://docs.luxonis.com/software/depthai-components/messages/img_frame.md)):
debugDispLrCheckIt1, debugDispLrCheckIt2, debugExtDispLrCheckIt1, debugExtDispLrCheckIt2, debugDispCostDump and confidenceMap.

## Internal block diagram of StereoDepth node

On the diagram, red rectangles are firmware settings that are configurable via the API. Gray rectangles are settings that are
not yet exposed in the API. We plan to expose as much configurability as possible, but please let us know if you would like to see
any of these settings configurable sooner.

If you click on the image, you will be redirected to the webapp. Some blocks have notes that provide additional technical
information.

## Currently configurable blocks

 * Stereo Mode:
   
   * Left-Right Check:
     
     * Left-Right Check, or LR-Check, is used to remove incorrectly calculated disparity pixels caused by occlusions at object
       borders (the left and right camera views are slightly different).
       1. Computes disparity by matching in R->L direction.
       2. Computes disparity by matching in L->R direction.
       3. Combines results from steps 1 and 2, running on Shave: each pixel d = disparity_LR(x,y) is compared with
          disparity_RL(x-d,y). If the difference is above a threshold, the pixel at (x,y) in the final disparity map is
          invalidated.
     * You can use debugDispLrCheckIt1 and debugDispLrCheckIt2 debug outputs for debugging/fine-tuning purposes.
   
   * Extended Disparity:
     
     * Extended disparity mode allows detecting objects at a closer distance for a given baseline. It increases the maximum
       disparity search from 96 to 191 levels, so the range becomes [0..190].
       1. Computes disparity on the original size images (e.g. 1280x720).
       2. Computes disparity on 2x downscaled images (e.g. 640x360).
       3. Combines the two level disparities on Shave, effectively covering a total disparity range of 191 pixels (in relation to
          the original resolution).
     * You can use debugExtDispLrCheckIt1 and debugExtDispLrCheckIt2 debug outputs for debugging/fine-tuning purposes.
   
   * Subpixel Disparity:
     
     * Subpixel mode improves precision and is especially useful for long-range measurements. It also helps with better
       estimation of surface normals.
     * In this mode, stereo cameras perform: 94 depth steps * 8 subpixel depth steps + 2 (min/max values) = 754 depth steps.
     * For comparison of normal disparity vs. subpixel disparity images, click
       [here](https://github.com/luxonis/depthai/issues/184).

 * Depth Filters:
   
   * Depth Filtering / [Depth Post-Processing](https://docs.luxonis.com/software/depthai/examples/depth_post_processing.md) is
     performed at the end of the depth pipeline. It helps with noise reduction and overall depth quality.

 * Mesh files:
   
   * Mesh files (homography matrices) are generated using the camera intrinsics, distortion coefficients, and rectification
     rotations. These files help overcome lens distortion, increasing accuracy, and are especially helpful when wide-FOV lenses
     are used.
   * Currently, mesh files are generated only for stereo cameras on the host during calibration. The generated mesh files are
     stored in depthai/resources which users can load to the device. This process will be moved to on device in the upcoming
     releases.

 * Confidence Threshold:
   
   * Confidence threshold: the stereo depth algorithm searches for the matching feature from the right camera point to the left
     image (along the 96 disparity levels). During this process it computes a cost for each disparity level, chooses the minimal
     cost, and uses it to compute the confidence at each pixel. The stereo node will output disparity/depth pixels only where the
     depth confidence is below the confidence threshold (a lower confidence value means better depth accuracy).
   * LR check threshold: Disparity is considered for the output when the difference between LR and RL disparities is smaller than
     the LR check threshold.
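The blocks above are enabled through StereoDepth setters and its initialConfig. A minimal sketch (method names as in the DepthAI v2 Python API; the threshold values are just illustrative):

```python
import depthai as dai

pipeline = dai.Pipeline()
stereo = pipeline.create(dai.node.StereoDepth)

stereo.setLeftRightCheck(True)      # remove occlusion artifacts at object borders
stereo.setExtendedDisparity(False)  # enable for a closer minimum depth
stereo.setSubpixel(True)            # enable for better long-range precision

# Thresholds discussed above:
stereo.initialConfig.setConfidenceThreshold(200)     # 0..255, lower means stricter filtering
stereo.initialConfig.setLeftRightCheckThreshold(10)  # max allowed |LR - RL| disparity difference
```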

## Limitations

 * Median filtering is disabled when subpixel mode is set to 4 or 5 bits.
 * For RGB-depth alignment the RGB camera has to be placed on the same horizontal line as the stereo camera pair.
 * RGB-depth alignment doesn't work when using disparity shift.

## Changing camera calibration during runtime

Since the depthai-2.29.0 release, changing the camera calibration during runtime is possible. The calibration can be changed with
the device.setCalibration() method, which is useful if the device's stereo quality degrades over time due to environmental
factors. The current calibration can be read with device.getCalibration().
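A minimal sketch of the round-trip (assuming depthai >= 2.29.0 and a connected device):

```python
import depthai as dai

with dai.Device() as device:
    calib = device.getCalibration()  # read the current calibration (CalibrationHandler)
    # ... update the calibration data here, e.g. after a re-calibration routine ...
    device.setCalibration(calib)     # apply the (modified) calibration at runtime
```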

## Depth Presets

We have made a few depth presets that can be used to quickly set up the StereoDepth node for different scenarios, without having
to manually set all the parameters listed under [Configuring stereo
depth](https://docs.luxonis.com/hardware/platform/depth/configuring-stereo-depth.md).


| Feature | Default | Face | High Detail | Robotics |
| --- | --- | --- | --- | --- |
| Base Profile | HIGH_DENSITY | HIGH_DENSITY | HIGH_ACCURACY | HIGH_DENSITY |
| Median Filter | 7x7 kernel | Off | Off | 7x7 kernel |
| Subpixel Precision | 3-bit | 5-bit | 5-bit | 3-bit |
| Extended Disparity | No | Yes | Yes | No |
| Range | 0-15m | 3cm-3m | 0-15m | 0-10m |
| Temporal Filter | Enabled | Enabled | Enabled | Disabled |
| Use Case | General purpose | Face tracking, biometrics | 3D scanning, object details | Navigation, obstacle detection |

All presets include:

 * Left-right consistency check
 * 2x decimation with pixel skipping
 * Speckle filtering (range: 200, threshold: 2)
 * 3 hardware resources for [post-processing](https://docs.luxonis.com/software/depthai/examples/depth_post_processing.md)
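A preset is applied with setDefaultProfilePreset() (see the Reference section below); for example, for the Robotics column of the table above:

```python
import depthai as dai

pipeline = dai.Pipeline()
stereo = pipeline.create(dai.node.StereoDepth)
# Apply the Robotics preset; individual settings can still be overridden afterwards
stereo.setDefaultProfilePreset(dai.node.StereoDepth.PresetMode.ROBOTICS)
```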

## Stereo depth FPS

StereoDepth handles quite a few complex operations across multiple hardware blocks, and the FPS depends on the configuration of
this node. So instead of FPS, let's look at the latency of each operation.

First in the pipeline (see [Internal block diagram](#StereoDepth-Internal%2520block%2520diagram%2520of%2520StereoDepth%2520node))
is rectification, which runs on the warp engine. Because wide-FOV frames require a more complex mesh to get rectified, they take a
bit longer to process.

| Process \ Resolution | 400P [ms] | 720P [ms] | 800P [ms] |
| --- | --- | --- | --- |
| Rectification | 1.5 | 5 | 6 |
| Wide FOV rectification | 2.5 | 8 | 9 |

Then we have the actual stereo disparity matching, which runs on its own stereo engine. This is usually the most time-consuming
part (unless we have heavy [post-processing filters](https://docs.luxonis.com/software/depthai/examples/depth_post_processing.md)),
which means it's likely the bottleneck in the pipeline. First the glossary:

 * LR: Left-Right check (stereo.setLeftRightCheck()). It requires 2 stereo matching operations (L->R and R->L direction) and then
   combines the results
 * Ext: Extended disparity mode (stereo.setExtendedDisparity()). It requires 2 stereo matching operations (original and
   downscaled) and then combines the results
 * Sub: Subpixel matching (stereo.setSubpixel()). For each pixel you get 96 bytes of cost map (one byte for each disparity
   level), so for an 800P frame about 98MB of data needs to be moved to CMX memory (used by SHAVEs) for the subpixel calculation.
   The bottleneck in this case is memory bandwidth (~4.1GB/sec). The number of subpixel bits doesn't affect latency

| Process \ Resolution | 400P [ms] | 720P [ms] | 800P [ms] |
| --- | --- | --- | --- |
| No LR, No Ext, No Sub | 3.4 | 7.4 | 8.1 |
| No LR, No Ext, Sub | 6.2 | 21.7 | 23.9 |
| No LR, Ext, No Sub | 8.5 | 21.5 | 23.5 |
| No LR, Ext, Sub | 11 | 32.7 | 36.7 |
| LR, No Ext, No Sub | 7.4 | 18 | 19.8 |
| LR, No Ext, Sub | 10.1 | 27.6 | 30.5 |
| LR, Ext, No Sub | 16.7 | 43.7 | 48.3 |
| LR, Ext, Sub | 20.4 | 56 | 62 |

As this step is likely the bottleneck, we can estimate the FPS from it. For example, if you enable LR, Ext and Sub at 800P,
you'll get around (1 sec / 62 ms =) 16 FPS.
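The estimate above can be reproduced directly from the latency table (pure arithmetic, no device needed):

```python
# Disparity matching latency [ms] from the LR + Ext + Sub row of the table above
latency_ms = {"400P": 20.4, "720P": 56.0, "800P": 62.0}

# If stereo matching is the pipeline bottleneck, achievable FPS is ~1000 / latency
fps = {res: 1000.0 / ms for res, ms in latency_ms.items()}
print({res: round(f, 1) for res, f in fps.items()})  # 800P works out to ~16.1 FPS
```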

At the end we have filtering - like median filtering (running on the median hardware block), which is a simple operation and
doesn't add much latency. The kernel size (`3x3` / `5x5` / `7x7`) doesn't affect the latency:

| Process \ Resolution | 400P [ms] | 720P [ms] | 800P [ms] |
| --- | --- | --- | --- |
| Median filtering (`3x3` / `5x5` / `7x7`) | 0.8 | 1.8 | 1.8 |

There are also other [post-processing filters](https://docs.luxonis.com/software/depthai/examples/depth_post_processing.md) which
can take significant compute/latency, but we haven't done extensive measurements for them yet.

## Usage

#### Python

```python
import depthai as dai

pipeline = dai.Pipeline()
stereo = pipeline.create(dai.node.StereoDepth)

# Better handling for occlusions (pass True to enable):
stereo.setLeftRightCheck(False)
# Closer-in minimum depth; disparity range is doubled when enabled:
stereo.setExtendedDisparity(False)
# Better accuracy for longer distances; fractional disparity 32 levels when enabled:
stereo.setSubpixel(False)

# Define and configure MonoCamera nodes beforehand
left.out.link(stereo.left)
right.out.link(stereo.right)
```

#### C++

```cpp
#include "depthai/depthai.hpp"

dai::Pipeline pipeline;
auto stereo = pipeline.create<dai::node::StereoDepth>();

// Better handling for occlusions (pass true to enable):
stereo->setLeftRightCheck(false);
// Closer-in minimum depth; disparity range is doubled when enabled:
stereo->setExtendedDisparity(false);
// Better accuracy for longer distances; fractional disparity 32 levels when enabled:
stereo->setSubpixel(false);

// Define and configure MonoCamera nodes beforehand
left->out.link(stereo->left);
right->out.link(stereo->right);
```

## Examples of functionality

 * [Depth Preview](https://docs.luxonis.com/software/depthai/examples/depth_preview.md)
 * [RGB Depth alignment](https://docs.luxonis.com/software/depthai/examples/rgb_depth_aligned.md) - align depth map to color
   camera (RGB-D)
 * [Stereo Depth Post-Processing filters](https://docs.luxonis.com/software/depthai/examples/depth_post_processing.md)
 * [Mono & MobilenetSSD & Depth](https://docs.luxonis.com/software/depthai/examples/mono_depth_mobilenetssd.md)
 * [RGB & MobilenetSSD with spatial
   data](https://docs.luxonis.com/software/depthai-components/nodes/mobilenet_spatial_detection_network.md)

## Reference

### depthai.node.StereoDepth(depthai.Node)

Kind: Class

StereoDepth node. Compute stereo disparity and depth from left-right image pair.

#### depthai.node.StereoDepth.PresetMode

Kind: Class

Preset modes for stereo depth.

Members:

HIGH_ACCURACY : **Deprecated:** Will be removed in future releases and replaced with DEFAULT

HIGH_DENSITY : **Deprecated:** Will be removed in future releases and replaced with DEFAULT

DEFAULT

FACE

HIGH_DETAIL

ROBOTICS

##### DEFAULT: typing.ClassVar[StereoDepth.PresetMode]

Kind: Class Variable

##### FACE: typing.ClassVar[StereoDepth.PresetMode]

Kind: Class Variable

##### HIGH_ACCURACY: typing.ClassVar[StereoDepth.PresetMode]

Kind: Class Variable

##### HIGH_DENSITY: typing.ClassVar[StereoDepth.PresetMode]

Kind: Class Variable

##### HIGH_DETAIL: typing.ClassVar[StereoDepth.PresetMode]

Kind: Class Variable

##### ROBOTICS: typing.ClassVar[StereoDepth.PresetMode]

Kind: Class Variable

##### __members__: typing.ClassVar[dict[str, StereoDepth.PresetMode]]

Kind: Class Variable

##### __eq__(self, other: typing.Any) -> bool: bool

Kind: Method

##### __getstate__(self) -> int: int

Kind: Method

##### __hash__(self) -> int: int

Kind: Method

##### __index__(self) -> int: int

Kind: Method

##### __init__(self, value: typing.SupportsInt)

Kind: Method

##### __int__(self) -> int: int

Kind: Method

##### __ne__(self, other: typing.Any) -> bool: bool

Kind: Method

##### __repr__(self) -> str: str

Kind: Method

##### __setstate__(self, state: typing.SupportsInt)

Kind: Method

##### __str__(self) -> str: str

Kind: Method

##### name

Kind: Property

##### value

Kind: Property

#### enableDistortionCorrection(self, arg0: bool)

Kind: Method

Equivalent to useHomographyRectification(!enableDistortionCorrection)

#### getMaxDisparity(self) -> float: float

Kind: Method

Useful for normalization of the disparity map.

Returns:
Maximum disparity value that the node can return
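A typical use is normalizing the raw disparity frame for visualization. A host-side sketch with NumPy (the sample values and 760, the maximum for 3-bit subpixel mode, are illustrative):

```python
import numpy as np

def normalize_disparity(frame: np.ndarray, max_disparity: float) -> np.ndarray:
    """Scale a raw disparity frame to the 0..255 range for display."""
    return (frame * (255.0 / max_disparity)).astype(np.uint8)

# Subpixel mode with 3 fractional bits -> maximum disparity value of 760
disp = np.array([[0, 380, 760]], dtype=np.uint16)
print(normalize_disparity(disp, 760.0))  # [[  0 127 255]]
```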

#### loadMeshData(self, dataLeft: collections.abc.Sequence[typing.SupportsInt], dataRight: collections.abc.Sequence[typing.SupportsInt])

Kind: Method

Specify mesh calibration data for 'left' and 'right' inputs, as vectors of
bytes. Overrides useHomographyRectification behavior. See `loadMeshFiles` for
the expected data format

#### loadMeshFiles(self, pathLeft: Path, pathRight: Path)

Kind: Method

Specify local filesystem paths to the mesh calibration files for 'left' and
'right' inputs.

When a mesh calibration is set, it overrides the camera intrinsics/extrinsics
matrices. Overrides useHomographyRectification behavior. Mesh format: a sequence
of (y,x) points as 'float' with coordinates from the input image to be mapped in
the output. The mesh can be subsampled, configured by `setMeshStep`.

With a 1280x800 resolution and the default (16,16) step, the required mesh size
is:

width: 1280 / 16 + 1 = 81

height: 800 / 16 + 1 = 51
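The same calculation for an arbitrary resolution and mesh step:

```python
def mesh_size(width: int, height: int, step: tuple = (16, 16)) -> tuple:
    # One mesh point every `step` pixels, plus one point for the frame edge
    return width // step[0] + 1, height // step[1] + 1

print(mesh_size(1280, 800))  # (81, 51), matching the example above
```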

#### setAlphaScaling(self, arg0: typing.SupportsFloat)

Kind: Method

Free scaling parameter between 0 (when all the pixels in the undistorted image
are valid) and 1 (when all the source image pixels are retained in the
undistorted image). On some high-distortion lenses, and/or due to rectification
(image rotated), invalid areas may appear even with alpha=0; in these cases
alpha < 0.0 helps remove the invalid areas. See getOptimalNewCameraMatrix from
OpenCV for more details.

#### setBaseline(self, arg0: typing.SupportsFloat)

Kind: Method

Override baseline from calibration. Used only in disparity to depth conversion.
Units are centimeters.

#### setConfidenceThreshold(self, arg0: typing.SupportsInt)

Kind: Method

Confidence threshold for disparity calculation

Parameter ``confThr``:
Confidence threshold value 0..255

#### setDefaultProfilePreset(self, arg0: StereoDepth.PresetMode)

Kind: Method

Sets a default preset based on specified option.

Parameter ``mode``:
Stereo depth preset mode

#### setDepthAlign()

Kind: Method

#### setDepthAlignmentUseSpecTranslation(self, arg0: bool)

Kind: Method

Use baseline information for depth alignment from specs (design data) or from
calibration. Default: true

#### setDisparityToDepthUseSpecTranslation(self, arg0: bool)

Kind: Method

Use baseline information for disparity to depth conversion from specs (design
data) or from calibration. Default: true

#### setEmptyCalibration(self)

Kind: Method

Specify that a passthrough/dummy calibration should be used, when input frames
are already rectified (e.g. sourced from recordings on the host)

#### setExtendedDisparity(self, enable: bool)

Kind: Method

Disparity range increased from 0-95 to 0-190, combined from full resolution and
downscaled images.

Suitable for short range objects. Currently incompatible with sub-pixel
disparity

#### setFocalLength(self, arg0: typing.SupportsFloat)

Kind: Method

Override focal length from calibration. Used only in disparity to depth
conversion. Units are pixels.

#### setFocalLengthFromCalibration(self, arg0: bool)

Kind: Method

Whether to use focal length from calibration intrinsics or calculate based on
calibration FOV. Default value is true.

#### setInputResolution()

Kind: Method

#### setLeftRightCheck(self, enable: bool)

Kind: Method

Computes disparity in both L-R and R-L directions and combines them, for better
occlusion handling and discarding of invalid disparity values.

#### setMedianFilter(self, arg0: depthai.MedianFilter)

Kind: Method

Parameter ``median``:
Set kernel size for disparity/depth median filtering, or disable

#### setMeshStep(self, width: typing.SupportsInt, height: typing.SupportsInt)

Kind: Method

Set the distance between mesh points. Default: (16, 16)

#### setNumFramesPool(self, arg0: typing.SupportsInt)

Kind: Method

Specify number of frames in pool.

Parameter ``numFramesPool``:
How many frames should the pool have

#### setOutputDepth(self, arg0: bool)

Kind: Method

#### setOutputKeepAspectRatio(self, keep: bool)

Kind: Method

Specifies whether the frames resized by `setOutputSize` should preserve aspect
ratio, with potential cropping when enabled. Default `true`

#### setOutputRectified(self, arg0: bool)

Kind: Method

#### setOutputSize(self, width: typing.SupportsInt, height: typing.SupportsInt)

Kind: Method

Specify disparity/depth output resolution size, implemented by scaling.

Currently only applicable when aligning to RGB camera

#### setPostProcessingHardwareResources(self, arg0: typing.SupportsInt, arg1: typing.SupportsInt)

Kind: Method

Specify allocated hardware resources for stereo depth. Suitable only to increase
post processing runtime.

Parameter ``numShaves``:
Number of shaves.

Parameter ``numMemorySlices``:
Number of memory slices.

#### setRectification(self, enable: bool)

Kind: Method

Rectify input images or not.

#### setRectificationUseSpecTranslation(self, arg0: bool)

Kind: Method

Obtain rectification matrices using spec translation (design data) or from
calibration in calculations. Should be used only for debugging. Default: false

#### setRectifyEdgeFillColor(self, color: typing.SupportsInt)

Kind: Method

Fill color for missing data at frame edges

Parameter ``color``:
Grayscale 0..255, or -1 to replicate pixels

#### setRectifyMirrorFrame(self, arg0: bool)

Kind: Method

DEPRECATED function. It was removed, since rectified images are not flipped
anymore. Mirror rectified frames, only when LR-check mode is disabled. Default
`true`. The mirroring is required to have a normal non-mirrored disparity/depth
output.

A side effect of this option is disparity alignment to the perspective of left
or right input: `false`: mapped to left and mirrored, `true`: mapped to right.
With LR-check enabled, this option is ignored, none of the outputs are mirrored,
and disparity is mapped to right.

Parameter ``enable``:
True for normal disparity/depth, otherwise mirrored

#### setRuntimeModeSwitch(self, arg0: bool)

Kind: Method

Enable runtime stereo mode switch, e.g. from standard to LR-check mode. Note: when
enabled, resources are allocated for the worst case, to enable switching to any mode.

#### setSubpixel(self, enable: bool)

Kind: Method

Computes disparity with sub-pixel interpolation (3 fractional bits by default).

Suitable for long range. Currently incompatible with extended disparity

#### setSubpixelFractionalBits(self, subpixelFractionalBits: typing.SupportsInt)

Kind: Method

Number of fractional bits for subpixel mode. Default value: 3. Valid values:
3,4,5. Defines the number of fractional disparities: 2^x. Median filter
postprocessing is supported only for 3 fractional bits.

#### useHomographyRectification(self, arg0: bool)

Kind: Method

Use 3x3 homography matrix for stereo rectification instead of sparse mesh
generated on device. Default behaviour is AUTO, for lenses with FOV over 85
degrees sparse mesh is used, otherwise 3x3 homography. If custom mesh data is
provided through loadMeshData or loadMeshFiles this option is ignored.

Parameter ``useHomographyRectification``:
true: 3x3 homography matrix generated from calibration data is used for
stereo rectification, can't correct lens distortion. false: sparse mesh is
generated on-device from calibration data with mesh step specified with
setMeshStep (Default: (16, 16)), can correct lens distortion. Implementation
for generating the mesh is same as opencv's initUndistortRectifyMap
function. Only the first 8 distortion coefficients are used from calibration
data.

#### confidenceMap

Kind: Property

Outputs ImgFrame message that carries a RAW8 confidence map. Lower values mean
higher confidence of the calculated disparity value. RGB alignment, left-right
check and postprocessing (e.g. median filter) are not performed on the confidence
map.

#### debugDispCostDump

Kind: Property

Outputs ImgFrame message that carries cost dump of disparity map. Useful for
debugging/fine tuning.

#### debugDispLrCheckIt1

Kind: Property

Outputs ImgFrame message that carries left-right check first iteration (before
combining with second iteration) disparity map. Useful for debugging/fine
tuning.

#### debugDispLrCheckIt2

Kind: Property

Outputs ImgFrame message that carries left-right check second iteration (before
combining with first iteration) disparity map. Useful for debugging/fine tuning.

#### debugExtDispLrCheckIt1

Kind: Property

Outputs ImgFrame message that carries extended left-right check first iteration
(downscaled frame, before combining with second iteration) disparity map. Useful
for debugging/fine tuning.

#### debugExtDispLrCheckIt2

Kind: Property

Outputs ImgFrame message that carries extended left-right check second iteration
(downscaled frame, before combining with first iteration) disparity map. Useful
for debugging/fine tuning.

#### depth

Kind: Property

Outputs ImgFrame message that carries RAW16 encoded (0..65535) depth data in
depth units (millimeter by default).

Non-determined / invalid depth values are set to 0

#### disparity

Kind: Property

Outputs ImgFrame message that carries encoded disparity data:

 * RAW8 encoded (0..95) for standard mode
 * RAW8 encoded (0..190) for extended disparity mode
 * RAW16 encoded for subpixel disparity mode: 0..760 for 3 fractional bits (default), 0..1520 for 4 fractional bits, 0..3040 for
   5 fractional bits

#### initialConfig

Kind: Property

Initial config to use for StereoDepth.

#### inputConfig

Kind: Property

Input StereoDepthConfig message with ability to modify parameters in runtime.
Default queue is non-blocking with size 4.

#### left

Kind: Property

Input for left ImgFrame of left-right pair

Default queue is non-blocking with size 8

#### outConfig

Kind: Property

Outputs StereoDepthConfig message that contains current stereo configuration.

#### rectifiedLeft

Kind: Property

Outputs ImgFrame message that carries RAW8 encoded (grayscale) rectified frame
data.

#### rectifiedRight

Kind: Property

Outputs ImgFrame message that carries RAW8 encoded (grayscale) rectified frame
data.

#### right

Kind: Property

Input for right ImgFrame of left-right pair

Default queue is non-blocking with size 8

#### syncedLeft

Kind: Property

Passthrough ImgFrame message from 'left' Input.

#### syncedRight

Kind: Property

Passthrough ImgFrame message from 'right' Input.

## Disparity

Disparity refers to the distance between two corresponding points in the left and right image of a stereo pair.

When calculating the disparity, each pixel in the disparity map gets assigned a confidence value 0..255 by the stereo matching
algorithm:

 * 0 - maximum confidence that it holds a valid value
 * 255 - minimum confidence, so there is a higher chance that the value is incorrect

(note that this confidence score is inverted compared to, say, a neural network confidence score)

For the final disparity map, a filtering is applied based on the confidence threshold value: the pixels that have their confidence
score larger than the threshold get invalidated, i.e. their disparity value is set to zero. You can set the confidence threshold
with stereo.initialConfig.setConfidenceThreshold().
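The invalidation step can be illustrated on the host with NumPy (a sketch of the behavior described above, not the actual firmware code):

```python
import numpy as np

def apply_confidence_threshold(disparity: np.ndarray, confidence: np.ndarray,
                               threshold: int = 245) -> np.ndarray:
    """Zero out disparity pixels whose confidence score exceeds the threshold
    (0 = most confident, 255 = least confident)."""
    out = disparity.copy()
    out[confidence > threshold] = 0
    return out

disp = np.array([10, 20, 30], dtype=np.uint8)
conf = np.array([0, 250, 100], dtype=np.uint8)
print(apply_confidence_threshold(disp, conf))  # [10  0 30]
```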

## Calculate depth using disparity map

Disparity and depth are inversely related. As disparity decreases, depth increases non-linearly, at a rate that depends on the
baseline and focal length. This means that if the disparity value is close to zero, a small change in disparity generates a large
change in depth; conversely, if the disparity value is large, even large changes in disparity lead to only small changes in depth.

By considering this fact, depth can be calculated using this formula:

```python
depth = focal_length_in_pixels * baseline / disparity_in_pixels
```

Where baseline is the distance between two mono cameras. Note the unit used for baseline and depth is the same.

To get the focal length in pixels, you can use the [Calibration
Reader](https://docs.luxonis.com/software/depthai/examples/calibration_reader.md), as the focal length in pixels is stored in the
camera intrinsics (intrinsics[0][0]):

```python
import depthai as dai

with dai.Device() as device:
  calibData = device.readCalibration()
  intrinsics = calibData.getCameraIntrinsics(dai.CameraBoardSocket.CAM_C)
  print('Right mono camera focal length in pixels:', intrinsics[0][0])
```

Here's the theoretical calculation of the focal length in pixels:

```python
from math import tan, pi

def focal_length_in_pixels(image_width_in_pixels, hfov_degrees):
    return image_width_in_pixels * 0.5 / tan(hfov_degrees * 0.5 * pi / 180)

# With 400P mono camera resolution where HFOV=71.9 degrees
print(focal_length_in_pixels(640, 71.9))   # ~441.25

# With 800P mono camera resolution where HFOV=71.9 degrees
print(focal_length_in_pixels(1280, 71.9))  # ~882.5
```

Examples for calculating the depth value, using the OAK-D (7.5cm baseline):

```python
# For OAK-D @ 400P mono cameras and a disparity of e.g. 50 pixels
depth = 441.25 * 7.5 / 50  # = 66.19 cm

# For OAK-D @ 800P mono cameras and a disparity of e.g. 10 pixels
depth = 882.5 * 7.5 / 10   # = 661.88 cm
```

Note that depth data is stored as uint16, where 0 is a special value meaning that the distance is unknown.
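Putting it together, disparity frames can be converted to depth on the host while keeping 0 as the "unknown" marker (focal length and baseline values from the OAK-D 800P example above):

```python
import numpy as np

def disparity_to_depth_cm(disp: np.ndarray, focal_px: float = 882.5,
                          baseline_cm: float = 7.5) -> np.ndarray:
    """Convert a disparity frame to depth in cm; disparity 0 stays 0 (unknown)."""
    disp = disp.astype(np.float32)
    depth = np.zeros_like(disp)
    valid = disp > 0
    depth[valid] = focal_px * baseline_cm / disp[valid]
    return depth

disp = np.array([[0, 10, 50]], dtype=np.uint16)
print(disparity_to_depth_cm(disp))  # 0 -> unknown, 10 px -> ~661.9 cm, 50 px -> ~132.4 cm
```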

## Min stereo depth distance

If the depth results for close-in objects look weird, this is likely because they are below the minimum depth-perception distance
of the device.

To calculate this minimum distance, use the formula from [Calculate depth using disparity
map](#StereoDepth-Calculate%2520depth%2520using%2520disparity%2520map) and choose the maximum value for the disparity_in_pixels
parameter (keep in mind that disparity and depth are inversely related, so the maximum disparity yields the smallest distance).

For example OAK-D has a baseline of 7.5cm, focal_length_in_pixels of 882.5 pixels and the default maximum value for
disparity_in_pixels is 95. By using the [Calculate depth using disparity
map](#StereoDepth-Calculate%2520depth%2520using%2520disparity%2520map) we get:

```python
# min_distance = focal_length_in_pixels * baseline / disparity_in_pixels
min_distance = 882.5 * 7.5 / 95  # = 69.67 cm
```

or roughly 70cm.

However this distance can be cut in 1/2 (to around 35cm for the OAK-D) with the following options:

 1. Changing the resolution to 640x400, instead of the standard 1280x800.

 2. Enabling Extended Disparity.

Extended Disparity mode increases the disparity levels from the standard 96 to 191 pixels, thereby halving the minimum depth. It
does so by computing the 96-pixel disparities on the original 1280x720 image and on the downscaled 640x360 image, which are then
merged into a 191-level disparity. For more information see the Extended Disparity tab in [Currently configurable
blocks](#StereoDepth-Currently%2520configurable%2520blocks).

Using the previous OAK-D example, disparity_in_pixels now becomes 190 and the minimum distance is:

```python
# min_distance = focal_length_in_pixels * baseline / disparity_in_pixels
min_distance = 882.5 * 7.5 / 190  # = 34.84 cm
```

or roughly 35cm.

> Applying both of those options is possible, which would set the minimum depth to 1/4 of the standard settings, but at such short
distances the minimum depth is limited by the cameras' focus: OAK-D mono cameras have a fixed focus range of 19.6cm - infinity.

See [these
examples](https://github.com/luxonis/oak-examples/tree/master/gen2-camera-demo#real-time-depth-from-depthai-stereo-pair) for how
to enable Extended Disparity.

## Disparity shift to lower min depth perception

Another option to perceive a closer depth range is to use disparity shift. Disparity shift moves the starting point of the
disparity search, which significantly decreases the maximum depth (MaxZ) perception, but also decreases the minimum depth (MinZ)
perception. Disparity shift can be combined with extended/subpixel/LR-check modes.

Left graph shows min and max disparity and depth for OAK-D (7.5cm baseline, 800P resolution, ~70° HFOV) by default (disparity
shift=0). See [Calculate depth using disparity map](#StereoDepth-Calculate%2520depth%2520using%2520disparity%2520map). Since
hardware (stereo block) has a fixed 95 pixel disparity search, DepthAI will search from 0 pixels (depth=INF) to 95 pixels
(depth=71cm).

Right graph shows the same, but at disparity shift set to 30 pixels. This means that disparity search will be from 30 pixels
(depth=2.2m) to 125 pixels (depth=50cm). This also means that depth will be very accurate at the short range (theoretically below
5mm depth error).

Limitations:

 * Because of the inverse relationship between disparity and depth, MaxZ will decrease much faster than MinZ as the disparity
   shift is increased. Therefore, it is advised not to use a larger than necessary disparity shift.
 * Tradeoff in reducing the MinZ this way is that objects at distances farther away than MaxZ will not be seen.
 * Because of the point above, we only recommend using disparity shift when MaxZ is known, such as having a depth camera mounted
   above a table pointing down at the table surface.
 * Output disparity map is not expanded, only the depth map. So if disparity shift is set to 50, and disparity value obtained is
   90, the real disparity is 140.

Compared to Extended disparity, disparity shift:

 * (+) Is faster, as it doesn't require an extra computation, which means there's also no extra latency
 * (-) Reduces the MaxZ (significantly), while extended disparity only reduces MinZ.

Disparity shift can be combined with extended disparity.
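The numbers above can be derived from the depth formula; for the OAK-D 800P example (882.5 px focal length, 7.5 cm baseline) with a 30-pixel disparity shift:

```python
focal_px, baseline_cm = 882.5, 7.5  # OAK-D at 800P
shift = 30                          # disparity shift in pixels

# The hardware searches a fixed 95-pixel disparity window, now starting at `shift`
max_z_cm = focal_px * baseline_cm / shift         # ~2.2 m
min_z_cm = focal_px * baseline_cm / (shift + 95)  # ~53 cm
print(round(max_z_cm), round(min_z_cm))
```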

### setDisparityShift(self, arg0: typing.SupportsInt) -> StereoDepthConfig: StereoDepthConfig

Kind: Method

Shift input frame by a number of pixels to increase minimum depth. For example
shifting by 48 will change effective disparity search range from (0,95] to
[48,143]. An alternative approach to reducing the minZ. We normally only
recommend doing this when it is known that there will be no objects farther away
than MaxZ, such as having a depth camera mounted above a table pointing down at
the table surface.

## Max stereo depth distance

The maximum depth perception distance depends on the [depth perception
accuracy](#StereoDepth-Depth%2520perception%2520accuracy). The formula used to calculate this distance is an approximation, but is
as follows:

```python
Dm = (baseline/2) * tan((90 - HFOV / HPixels)*pi/180)
```

So using this formula for existing models the theoretical max distance is:

```python
from math import tan, pi

# For OAK-D (7.5cm baseline)
Dm = (7.5 / 2) * tan((90 - 71.9 / 1280) * pi / 180)  # = 3825.03 cm = 38.25 meters

# For OAK-D-CM4 (9cm baseline)
Dm = (9 / 2) * tan((90 - 71.9 / 1280) * pi / 180)    # = 4590.04 cm = 45.9 meters
```

If greater precision for long-range measurements is required, consider enabling Subpixel Disparity or using a larger baseline
distance between the mono cameras. For a custom baseline, you could consider using an
[OAK-FFC](https://docs.luxonis.com/hardware.md) device or designing your own baseboard PCB with the required baseline. For more
information see Subpixel Disparity under the Stereo Mode tab in [Currently configurable
blocks](#StereoDepth-Currently%2520configurable%2520blocks).

## Depth perception accuracy

Disparity depth works by matching features from one image to the other and its accuracy is based on multiple parameters:

**Texture of objects / backgrounds**

Backgrounds may interfere with object detection, since backgrounds are objects too, which makes depth perception less accurate.
Disparity depth therefore works very well outdoors, where perfectly clean/blank surfaces are rare - but such surfaces are
relatively commonplace indoors (in clean buildings, at least).

**Lighting**

If the illumination is low, the disparity map will be of low confidence, which will result in a noisy depth map.

**Baseline / distance to objects**

A lower baseline enables us to detect depth at a closer distance, as long as the object is visible in both frames. However, it
reduces accuracy at large distances, because fewer pixels represent the object and the disparity decreases towards 0 much faster.
The common practice is therefore to choose the baseline according to how far/close we want to be able to detect objects.
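The baseline trade-off can be illustrated with the pinhole stereo relation `disparity = focal_length_px * baseline / depth`, which shows how quickly disparity shrinks with distance. The focal length below is an assumed placeholder value, not one from a calibrated device:

```python
focal_px = 882.5  # assumed focal length in pixels, for illustration only

def disparity_px(baseline_cm: float, depth_cm: float) -> float:
    # Pinhole stereo relation: disparity = focal * baseline / depth
    return focal_px * baseline_cm / depth_cm

# At 5 m, a 7.5 cm baseline still yields ~13 px of disparity, while a 3 cm
# baseline yields only ~5 px - so each pixel of matching error translates
# into a much larger depth error for the shorter baseline.
d_wide = disparity_px(7.5, 500)
d_narrow = disparity_px(3.0, 500)
```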

## Limitation

Since depth is calculated from disparity, which requires the pixels to overlap, there is inherently a vertical band on the left
side of the left mono camera frame and on the right side of the right mono camera frame where depth cannot be calculated, since
it is seen by only one camera. That band is marked with B in the following picture.

Meaning of variables on the picture:

 * BL [cm] - Baseline of stereo cameras.
 * Dv [cm] - Minimum distance where both cameras see an object (thus where depth can be calculated).
 * B [pixels] - Width of the band where depth cannot be calculated.
 * W [pixels] - Width of the mono camera frame in pixels (number of horizontal pixels), also noted as HPixels in other formulas.
 * D [cm] - Distance from the camera plane to an object (see image [Measuring real-world object
   dimensions](#StereoDepth-Measuring%2520real-world%2520object%2520dimensions)).
 * F [cm] - Width of image at the distance D.

With the use of the tan function, the following formulas can be obtained:

 * F = 2 * D * tan(HFOV/2)
 * Dv = (BL/2) * tan(90 - HFOV/2)

In order to obtain B, we can use tan function again (same as for F), but this time we must also multiply it by the ratio between W
and F in order to convert units to pixels. That gives the following formula:

```python
B = 2 * Dv * tan(HFOV/2) * W / F
B = 2 * Dv * tan(HFOV/2) * W / (2 * D * tan(HFOV/2))
B = W * Dv / D  # pixels
```

Example: If we are using OAK-D, which has a HFOV of 72°, a baseline (BL) of 7.5 cm and 640x400 (400P) resolution is used,
therefore W = 640 and an object is D = 100 cm away, we can calculate B in the following way:

```python
Dv = 7.5 / 2 * tan(90 - 72/2) = 3.75 * tan(54°) = 5.16 cm
B = 640 * 5.16 / 100 = 33 # pixels
```
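The same calculation can be written as a small Python function, a sketch following the formulas above:

```python
import math

def blind_band_px(bl_cm: float, hfov_deg: float, width_px: int, dist_cm: float) -> float:
    """Width (in pixels) of the band seen by only one camera: B = W * Dv / D."""
    dv_cm = (bl_cm / 2) * math.tan(math.radians(90 - hfov_deg / 2))
    return width_px * dv_cm / dist_cm

b = blind_band_px(7.5, 72, 640, 100)  # ~33 px for OAK-D at 1 m
```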

Credit for the calculations and images goes to our community member gregflurry, who shared them in
[this](https://discuss.luxonis.com/d/339-naive-question-regarding-stereodepth-disparity-and-depth-outputs/7) forum post.

> OAK-D-PRO will include both an IR dot projector and an IR LED, which will enable operation in no light. The IR LED is used to
> illuminate the whole area (for mono/color frames), while the IR dot projector is mostly for accurate disparity matching - to
> have good-quality depth maps on blank surfaces as well. Outdoors, the IR laser dot projector is only relevant at night. For
> more information see the development progress [here](https://github.com/luxonis/oak-hardware/issues/114).

## Measuring real-world object dimensions

Because the depth map contains the Z distance, objects parallel to the camera plane are measured accurately. For objects that are
not parallel, the Euclidean distance calculation can be used. Please refer to the image below:

When running e.g. the [RGB & MobilenetSSD with spatial
data](https://docs.luxonis.com/software/depthai-components/nodes/mobilenet_spatial_detection_network.md) example, you could
calculate the distance to the detected object from XYZ coordinates
([SpatialImgDetections](https://docs.luxonis.com/software/depthai-components/messages/spatial_img_detections.md)) using the code
below (after code line 143 of the example):

```python
import math

distance = math.sqrt(detection.spatialCoordinates.x ** 2 + detection.spatialCoordinates.y ** 2 + detection.spatialCoordinates.z ** 2) # mm
```

### Need assistance?

Head over to [Discussion Forum](https://discuss.luxonis.com/) for technical support or any other questions you might have.
