Robotics Vision Core 3 (RVC3)
Robotics Vision Core 3 (RVC3 for short) is the third generation of our RVC. It encapsulates three main components:
- DepthAI features that are fine-tuned for the particular SoC
- A performant SoC and all its support circuitry (HS PCB layout, power delivery network, efficient heat dissipation, etc.)
- Out-of-the-box connectivity with Luxonis Hub - our cloud platform, which allows for an end-to-end integration of the perception stack.
RVC3-based devices
Here's the list of devices that are built on top of the RVC3:
- RAE - Desktop robot for evaluation of DepthAI and rapid prototyping of robotics applications
- OAK-FFC 6P - A modular camera kit great for prototyping
State of the RVC3
First and most importantly, we want to be clear that Luxonis is 100% going to continue supporting the RVC3 architecture, including the standard and customer-specific devices that use it. RVC3 unlocks a number of improved capabilities over RVC2, and we will be supporting RVC3 for years to come to help enable those capabilities for our customers.
However, the RVC3 does have some limitations. It's built on the Movidius Keembay, which Intel has deprioritized. The primary limitation is in AI performance. Despite having higher TOPS, it lacks support for some neural network layers/operations, and some NN operations/layers are not optimized for the AI subsystem.
This was not the case for its predecessor (RVC2). Intel released the Neural Compute Stick 2 (NCS2) based on the Myriad X chip; it was used as general AI compute hardware and had optimizations for most NN operations.
This difference results in some models performing really well on the RVC3, while others are either unsupported or significantly slower than on the RVC2. For instance:
- ResNet50 (classification) FPS - RVC2: 29, RVC3: 114, almost a 4x speedup (roughly a 293% increase). For comparison, RVC4 achieves 934 FPS for ResNet50.
- Yolov5N (object detection) FPS - RVC2: 40, RVC3: 6, showing an approximately 85% decrease in FPS.
- Yolov6N (object detection) FPS - RVC2: 60, RVC3: 47, a more modest decrease of roughly 22%.
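The relative changes follow directly from the FPS figures in the list above. A small sketch of the arithmetic (the helper name `percent_change` is illustrative only); note that a throughput ratio of about 3.9x corresponds to an increase of about 293%:

```python
# FPS figures quoted in the list above: (RVC2, RVC3)
fps = {
    "ResNet50": (29, 114),
    "Yolov5N": (40, 6),
    "Yolov6N": (60, 47),
}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


changes = {name: percent_change(rvc2, rvc3) for name, (rvc2, rvc3) in fps.items()}

for name, (rvc2, rvc3) in fps.items():
    print(f"{name}: {rvc3 / rvc2:.2f}x the RVC2 throughput ({changes[name]:+.0f}%)")
```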
RVC3 Performance
- 3.0 TOPS for AI with INT8 quantization support
- Quad-core ARM A53 @ 1.5GHz, running Yocto Linux, acting as a host computer
- Imaging: ISP, max 6 cameras, 500 MP/s HDR, 3A
- Run any AI model, even ones with a custom architecture; models need to be converted first.
- Cloud platform - The Luxonis Hub - connectivity out-of-the-box
- On-device SLAM / VIO support
- Encoding: H.264, H.265, MJPEG - 4K/75FPS, Decoding: 4K/60FPS
- Computer vision: warp/dewarp, resize, crop via ImageManip node, edge detection, feature tracking. You can also run custom CV functions
- Object tracking: 2D and 3D tracking with ObjectTracker node
RVC3 compared to RVC2
These are the main differences compared to RVC2:
- Integrated quad-core ARM A53 running Yocto Linux
- Enhanced stereo depth perception
- NN INT8 quantization support
Quad-core ARM
Having a quad-core ARM A53 at 1.5GHz with Neon technology and floating point extensions (Linux 5.3) integrated into the VPU is similar to having RVC2 + Raspberry Pi 3B+ (quad-core A53 at 1.4GHz), which can make final projects and products more compact.
Custom applications
Users will be able to execute custom containerized apps on the ARM processor of Series 3 (S3) devices via Luxonis Hub. These containerized apps will also be able to interface with GPIOs and communication interfaces (I2C, UART, ...), so customers will be able to, e.g., read from a custom sensor or communicate with a microprocessor directly from the OAK-SoM MAX.
It will also be possible to use RVC3-based devices the same way as previous-generation devices (e.g. OAK-D, OAK-D-Lite): connect the device to your computer via USB and just start an application. With on-board computing capability, apps will be able to do full model decoding on the device itself, which would allow DepthAI apps to be more flexible and have lower latency.
SLAM / VIO
Since Series 3 OAK cameras have an on-board quad-core ARM, it will be possible to run VIO or SLAM software stacks on the OAK camera itself. Sparse SLAM is supported on-device; for dense SLAM, additional host computing might be required (TBD).
Stereo Depth
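For reference, a census transform encodes each pixel as a bit string recording which neighbours are darker than the centre pixel, and stereo matching then compares these descriptors by Hamming distance. The following is a minimal pure-Python sketch of a 3x3 census transform on toy data. The function names are illustrative, and this is not the on-device implementation.

```python
def census_3x3(image, row, col):
    """Return the 8-bit census descriptor of image[row][col].

    Each bit is 1 if the corresponding neighbour is darker than the centre.
    Assumes (row, col) is not on the image border.
    """
    center = image[row][col]
    descriptor = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the centre pixel itself
            bit = 1 if image[row + dr][col + dc] < center else 0
            descriptor = (descriptor << 1) | bit
    return descriptor


def hamming(a, b):
    """Number of differing bits between two descriptors (the matching cost)."""
    return bin(a ^ b).count("1")


# Toy 3x3 patches; only one neighbour differs between left and right.
left = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
right = [
    [10, 20, 30],
    [40, 50, 45],
    [70, 80, 90],
]

d_left = census_3x3(left, 1, 1)
d_right = census_3x3(right, 1, 1)
cost = hamming(d_left, d_right)
print(f"{d_left:08b} vs {d_right:08b}, cost = {cost}")
```

Because the descriptor depends only on intensity ordering, it is robust to brightness offsets between the two cameras, but it is a fixed, hand-designed descriptor.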
Series 3 OAK devices feature CNN-based calculation of pixel descriptors, compared to the census transform used in previous OAK series.
NN quantization
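As a rough illustration of what INT8 quantization does to a tensor, the sketch below applies symmetric per-tensor quantization to a few floating-point values and maps them back. This is a generic textbook scheme with assumed helper names (`quantize_int8`, `dequantize`), not the exact procedure used by OpenVINO's tooling.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map INT8 values back to approximate floats."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.003, 1.0]  # made-up example values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per value is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Small values such as 0.003 collapse to zero, which is one reason quantized models are usually calibrated and validated against the FP16 version.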
RVC3 supports the FP16 and INT8 data types. OpenVINO provides tools for quantization of models as well, so converting a model won't be any different from converting it for RVC2 (which supports only FP16).
INT8 quantization improves inference performance of some neural network layers.
RVC3 has 20 DPUs (Data Processing Units) integrated, which are capable of delivering 5.12 TOPS (INT8) or 1.28 TFLOPS (FP16).
RVC3 Specifications
Specification | Value |
---|---|
Nominal VPU clock | 500 MHz |
ResNet-50 performance | 240 inferences per second |
AI TOPS | 3.0 TOPS |
SHAVE processors | 12 |
Computer Vision | CV/Warp acceleration at 1.0 GB/s. 6DOF motion mask support |
Stereo depth | 720P resolution at 180 FPS |
Video encoding | Max 4K 75FPS. H.264, H.265 and JPEG codecs |
Video decoding | Max 4K 60FPS, max 10 channels of 1080P/30FPS. H.264, H.265 and JPEG codecs |
Imaging | ISP, Max 6 cameras, 500 MP/s HDR, TNF, 3A, ULL. 4K/60FPS support |
Interfaces | Multiple I2C, Quad-SPI, I2S, UART, PCIe Gen4 interfaces, USB 3.1/2, Gigabit Ethernet, many GPIOs |
Operating temperature | -40°C to 105°C (same as RVC2) |
RAM support | 2x 32-bit DRAM at 1600-2133 MHz |
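The throughput figures in the table can be turned into a quick feasibility check. The sketch below assumes, as a simplification, that the ISP budget is simply the sum of width x height x FPS over all cameras, compared against the 500 MP/s figure. Actual limits depend on sensor modes and other pipeline load, and the camera configuration shown is hypothetical.

```python
ISP_BUDGET_MP_PER_S = 500.0  # "500 MP/s" from the Imaging row


def pixel_rate_mp_per_s(width, height, fps):
    """Pixel throughput of one stream in megapixels per second."""
    return width * height * fps / 1e6


# Hypothetical setup: (width, height, fps) per camera
cameras = [
    (3840, 2160, 30),  # one 4K colour camera
    (1280, 800, 60),   # two lower-resolution cameras
    (1280, 800, 60),
]

total = sum(pixel_rate_mp_per_s(*cam) for cam in cameras)
fits = total <= ISP_BUDGET_MP_PER_S
print(f"total {total:.1f} MP/s, fits budget: {fits}")
```

Under the same assumption, a single 4K stream at 60 FPS works out to about 498 MP/s, which is consistent with the 4K/60FPS figure in the Imaging row.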
Native media support
- GStreamer framework
- OpenCV (or G-API) for computer vision
- Video Acceleration API / Intel Media SDK for encoding and decoding
RVC3 HDR support
RVC3 supports HDR (High Dynamic Range) mode, which captures images with a higher dynamic range than the standard mode. Supported camera sensors:
- IMX412 driver (that we reuse for IMX577/IMX477 as well), requires 12MP resolution and 4 MIPI lanes
- IMX327 driver (that we reuse for IMX462), requires 1080P and 4 MIPI lanes
Python
colorCam1.initialControl.setMisc("camera-mode", "HDR-2DOL")
# or, for 3DOL:
colorCam2.initialControl.setMisc("camera-mode", "HDR-3DOL")
# Note that 3DOL is only suited for static scenes