Use a Pre-trained OpenVINO model

In this tutorial, you’ll learn how to use a pre-trained face detection model to detect faces in real-time, even on a low-powered Raspberry Pi.


If you would like to learn more about OpenVINO, the Open Model Zoo, and how to locally convert an OpenVINO model into a .blob file, check out our Local OpenVINO Model Conversion tutorial.

Run DepthAI Default Model

The depthai_demo.py script can be modified directly to do your bidding, or you can simply pass arguments to it to select which models you want to run.

For simplicity we will do the latter, simply passing arguments so that DepthAI runs face-detection-retail-0004 instead of the default model.

Before switching to face-detection-retail-0004, let’s take a baby step and give these command line options a spin. In this case we’ll just pass in the same neural network that runs by default, just to make sure we’re doing it right:

python3 depthai_demo.py -dd

This will run a typical demo: a MobileNetV2-SSD object detector trained on the PASCAL VOC 2007 classes, which are:

  • Person: person

  • Animal: bird, cat, cow, dog, horse, sheep

  • Vehicle: airplane, bicycle, boat, bus, car, motorbike, train

  • Indoor: bottle, chair, dining table, potted plant, sofa, TV/monitor
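For reference, the raw network reports each detection as a class index, not a name. A small lookup of the 20 VOC labels (plus the background class at index 0) shows the mapping; note the model uses the original VOC spellings such as "aeroplane" and "tvmonitor". This helper is illustrative, not part of the demo script:

```python
# The 20 PASCAL VOC 2007 classes predicted by the default mobilenet-ssd model,
# with the background class at index 0 (illustrative lookup, not DepthAI API).
VOC_LABELS = [
    "background",
    "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow",
    "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]

def label_for(class_id: int) -> str:
    """Map a raw SSD class index to its human-readable VOC label."""
    return VOC_LABELS[class_id]

print(label_for(20))  # -> tvmonitor
```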

I ran this on my laptop with an OAK-D sitting on my desk pointing upwards at a random angle, and it picks out the corner of my laptop screen, correctly identifying it as tvmonitor:


Run the Face Detection Model

Now that we’ve got this verified, let’s move on to trying out other models, starting with face-detection-retail-0004.

To use this model, simply specify the name of the model to be run with the -cnn flag, as below:

python3 depthai_demo.py -dd -cnn face-detection-retail-0004

This will download the compiled face-detection-retail-0004 NN model and use it to run inference (detect faces) on color frames:


It’s that easy. Substitute your face for mine, of course.
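Under the hood, MobileNet-SSD-style models (including this face detector) emit detections as flat groups of seven numbers. If you ever want to consume the raw output yourself rather than rely on the demo’s overlay, the decoding looks roughly like this (a minimal sketch; the helper name and frame size are illustrative):

```python
def parse_ssd_detections(raw, conf_threshold=0.5, frame_w=300, frame_h=300):
    """Decode raw SSD output: each detection is 7 floats,
    [image_id, class_id, confidence, xmin, ymin, xmax, ymax],
    with box coordinates normalized to the 0..1 range."""
    results = []
    for i in range(0, len(raw), 7):
        image_id, class_id, conf, x1, y1, x2, y2 = raw[i:i + 7]
        if image_id < 0 or conf < conf_threshold:
            continue  # image_id == -1 marks padding / end of valid detections
        results.append({
            "class_id": int(class_id),
            "confidence": conf,
            # Scale normalized coordinates up to pixel coordinates
            "bbox": (round(x1 * frame_w), round(y1 * frame_h),
                     round(x2 * frame_w), round(y2 * frame_h)),
        })
    return results
```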

And if you’d like to try other models, just peruse the repository’s model list and run them by name, just like above.

Now take some time to play around with the model. You can for example check how far away the model can detect your face:

[Two example images: a face detected up close, and near the maximum detection range]

In the latter image you can see that I’m quite back-lit, which is one of the main challenges in face detection (and other feature detection). In this case, it’s likely limiting the maximum range for which a face can be detected. From the testing above, for a confidence threshold of 50%, this range appears to be about 20 feet. You could get longer range out of the same model by reducing the model confidence threshold (by changing from 0.5 here) at the cost of increased probability of false positives.
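To see that threshold trade-off concretely, here is a toy filtering step with made-up confidence values (the numbers are hypothetical, not from the model):

```python
# Hypothetical face-detection confidences for five candidate boxes
confidences = [0.92, 0.61, 0.48, 0.35, 0.12]

kept_default = [c for c in confidences if c >= 0.5]  # default 0.5 threshold
kept_relaxed = [c for c in confidences if c >= 0.3]  # relaxed threshold

print(kept_default)  # only the two strongest detections survive
print(kept_relaxed)  # two extra, lower-confidence (possibly false) detections appear
```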

Another limiting factor is that this is a relatively low-resolution model (300x300 pixels), so faces get fairly small fairly fast at a distance. So let’s try another face detection model that uses a higher resolution.
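A quick pinhole-camera estimate shows why the low resolution bites. Assuming roughly a 69° horizontal field of view (in the ballpark of the OAK-D color camera), a 300-pixel-wide network input, and an average face width of about 0.16 m (all rough assumptions):

```python
import math

def face_width_in_pixels(distance_m, hfov_deg=69.0, input_px=300, face_m=0.16):
    """Estimate how many pixels a face spans at a given distance (pinhole model)."""
    focal_px = (input_px / 2) / math.tan(math.radians(hfov_deg / 2))
    return focal_px * face_m / distance_m

# At ~20 feet (about 6 m) a face covers only a handful of the 300 pixels,
# which is consistent with the detection range topping out around there.
print(face_width_in_pixels(6.0))
```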

Trying Other Models

The flow we walked through works for other pre-trained models in our repository (here), which include:

  • People semantic segmentation (deeplabv3p_person)

  • Face detection (ADAS) (face-detection-adas-0001)

  • Face detection (retail, the one we used) (face-detection-retail-0004)

  • Mobilenet general object detection (default model) (mobilenet-ssd)

  • Pose estimation (openpose2)

  • Pedestrian detection (ADAS) (pedestrian-detection-adas-0002)

  • Person detection (retail) (person-detection-retail-0013)

  • Person, Vehicle and Bike Detection (person-vehicle-bike-detection-crossroad-1016)

  • tiny Yolo - general object detection (tiny-yolo-v3)

  • Vehicle detection for driver-assistance (vehicle-detection-adas-0002)

  • Vehicle and license plate detection (vehicle-license-plate-detection-barrier-0106)

  • Yolo - general object detection (yolo-v3)

You can simply specify any of these models after the -cnn argument.

Let’s try out face-detection-adas-0001, which is intended for detecting faces inside the cabin of a vehicle (ADAS stands for Advanced Driver-Assistance Systems):

python3 depthai_demo.py -dd -cnn face-detection-adas-0001

Interestingly, this model actually has a shorter detection distance than the smaller model, despite its higher input resolution. Why? Likely because it was intentionally trained to detect only close-in faces, since it’s meant to be used inside a vehicle cabin. (You wouldn’t want to detect the faces in passing cars, for example.)

Spatial AI - Augmenting the Model with 3D Position

By default, DepthAI returns the full 3D position of each detection. In the commands above, we explicitly told it not to calculate depth with -dd (or --disable_depth).

So let’s run the same command, but with that flag omitted, so that 3D results are returned (and displayed):

python3 depthai_demo.py -cnn face-detection-retail-0004

And there you find the 3D position of my face!

You can then choose other models and get real-time 3D position for the class of interest.
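Conceptually, turning a detector bounding box plus the stereo depth map into a 3D position comes down to sampling a depth for the box and back-projecting the pixel through the pinhole camera model. DepthAI does this on-device; the sketch below only illustrates the idea, and the helper and its parameters are hypothetical:

```python
# Simplified illustration of bbox + depth map -> 3D position (not the DepthAI API).
def bbox_to_xyz(depth_m, bbox, focal_px, cx, cy):
    """depth_m: 2D list of per-pixel depth in metres; bbox: (x1, y1, x2, y2) pixels.
    Returns (x, y, z) in metres, in the camera coordinate frame."""
    x1, y1, x2, y2 = bbox
    # Use the depth at the bbox centre as the object's distance
    u = (x1 + x2) // 2
    v = (y1 + y2) // 2
    z = depth_m[v][u]
    # Pinhole model: back-project the pixel into camera coordinates
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return x, y, z
```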

Play with the feature and please share demos that you come up with (especially if you make a robot that stalks your cat), and if you run into any issues, please ping us on our Discord server.

And if you find any errors in these documents, please report an issue on GitHub.

Got questions?

We’re always happy to help with code or other questions you might have.