ON THIS PAGE

  • Simplified Conversion for Yolos
  • Overview
  • Example
  • Options of Tools

Simplified Conversion for Yolos

Overview

We created a tool that simplifies the export process for Yolo object detectors. The tool supports the conversion of Yolo models from V5 through V11, as well as Gold Yolo. Upload the weights of the pre-trained model (.pt file), set an input image shape, and choose the Robotics Vision Core generation of the target device (learn more about the parameters in Options of Tools); we'll then compile a blob and a JSON file with the information DepthAI needs to decode the results.

Example

This example shows how to convert YoloV6n R4 and how to run the compiled model on an OAK device.
  1. First, download the model's weights. You can download them from here.
  2. Open the tools in a browser of your choice. Then, upload the downloaded yolov6n.pt weights and set the Input image shape to 640 352 (we choose this shape because its aspect ratio is close to 16:9 while throughput and latency remain decent). Leave the rest of the options at their defaults.
  3. Click on the Submit button. The model will be converted automatically and downloaded as a ZIP archive containing the converted blob file, a JSON file, and the intermediate representation used to generate the blob.
  4. Extract the ZIP archive locally, then run inference on the exported model using the script below. Make sure to set MODEL_PATH, and refer to the .json file for the required input size and label information (see the sketch after the script for reading these values programmatically).
Python
import depthai as dai
from depthai_nodes.node import YOLOExtendedParser

MODEL_PATH = ...  # e.g. "yolov6n.blob"
INPUT_SIZE = ...  # e.g. (640, 352)
LABELS = ...  # e.g. ["person", "bicycle", "car", ...]

visualizer = dai.RemoteConnection(httpPort=8080)
device = dai.Device()
platform = device.getPlatform().name
frame_type = (
    dai.ImgFrame.Type.BGR888i if platform == "RVC4" else dai.ImgFrame.Type.BGR888p
)

with dai.Pipeline(device) as pipeline:

    # model node
    nn = pipeline.create(dai.node.NeuralNetwork)
    nn.setModelPath(MODEL_PATH)

    # parser node
    parser = pipeline.create(YOLOExtendedParser, n_classes=len(LABELS), label_names=LABELS)

    # input node
    camera = pipeline.create(dai.node.Camera).build()
    camera_stream = camera.requestOutput(size=INPUT_SIZE, type=frame_type)

    # connect nodes
    camera_stream.link(nn.input)
    nn.out.link(parser.input)

    # visualize
    visualizer.addTopic("Video", nn.passthrough, "images")
    visualizer.addTopic("Detections", parser.out, "images")

    pipeline.start()
    visualizer.registerPipeline(pipeline)

    while pipeline.isRunning():
        key = visualizer.waitKey(1)
        if key == ord("q"):
            print("Got q key from the remote connection!")
            break
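
If you prefer not to hard-code INPUT_SIZE and LABELS, both can be read from the JSON file shipped in the ZIP archive. The sketch below assumes the JSON exposes the input size under nn_config/input_size as a "WxH" string and the labels under mappings/labels; these key names are our assumption about the export format, so verify them against your own file.
Python
import json

# Minimal sketch: read input size and labels from the exported JSON.
# ASSUMPTION: the keys "nn_config"/"input_size" (a "WxH" string) and
# "mappings"/"labels" match your export -- verify against your file.
with open("yolov6n.json") as f:
    config = json.load(f)

width, height = map(int, config["nn_config"]["input_size"].split("x"))
INPUT_SIZE = (width, height)
LABELS = config["mappings"]["labels"]

print(INPUT_SIZE, f"{len(LABELS)} classes")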

Options of Tools

  • Yolo Version - Required, the Yolo version to use for the conversion. The tools include an automatic version detector: when you upload a model's weights, the Yolo version is detected and set automatically.
  • RVC2 or RVC3 - Required, the Robotics Vision Core generation of the target device.
  • File - Required, weights of a pre-trained model (.pt file); the file must be smaller than 300 MB.
  • Input image shape - Required, a single integer for a square input image, or width and height separated by a space. The shape must be divisible by 32 (or 64, depending on the stride); see the sketch after this list for a quick check.
  • Shaves - Optional, default value is 6. The number of SHAVE cores used. To read more about shaves, see here.
  • Use OpenVINO 2021.4 - Optional, default value is true. This checkbox controls whether the legacy frontend flag is used during compilation to IR. If unchecked, OpenVINO 2022.1 is used instead. We noticed a slight performance degradation with 2022.1, so we recommend leaving this set to true.
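
Since the input shape must satisfy the divisibility constraint, it can help to round a desired shape before submitting it. Below is a minimal sketch assuming a stride of 32; the helper name and defaults are illustrative choices of ours, not part of the tools.
Python
# Minimal sketch: round a desired input shape down to the nearest multiple
# of the model stride (32 here; use 64 if your model requires it).
# The helper name and defaults are illustrative, not part of the tools.
def round_to_stride(width: int, height: int, stride: int = 32) -> tuple[int, int]:
    return (width // stride) * stride, (height // stride) * stride

print(round_to_stride(650, 360))  # -> (640, 352), the shape used in the example above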