In-Cabin Visual Perception in Autonomous Driving

Feb 19, 2023

In-cabin visual perception is growing rapidly. Market demand is gradually expanding from early, simple face recognition and driver fatigue warnings to monitoring of a larger area of the cabin.

According to public forecasts, the global automotive CMOS image sensor (CIS) market will reach US$3.244 billion in 2027. Within that, the CIS market for in-cabin imaging systems is expected to grow from US$209 million in 2021 to US$571 million in 2027, leaving considerable room for development.
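As a rough check on the forecast above, the growth rate implied by the two in-cabin figures can be worked out directly. The sketch below assumes simple annual compounding over the six years from 2021 to 2027; the function name is illustrative only.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by two endpoint values."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted above, in millions of US dollars.
in_cabin_2021 = 209.0
in_cabin_2027 = 571.0

cagr = implied_cagr(in_cabin_2021, in_cabin_2027, years=2027 - 2021)
print(f"Implied in-cabin CIS CAGR, 2021-2027: {cagr:.1%}")
```

Under that assumption, the in-cabin segment grows by roughly 18% per year.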

Visual perception technology can improve the driving experience. Through face recognition, the smart cockpit can verify the driver’s identity and switch the central control interface and multimedia settings to that driver’s personal preferences. Gesture recognition enables smooth control that combines several modes of human-computer interaction. Driver status monitoring can identify behavioral states that affect driving safety, such as closed eyes, a lowered head, or a turned head, and warn the driver through voice alarms.
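The alerting side of driver status monitoring can be sketched in a few lines. This is only an illustration, not a description of any particular product: it assumes an upstream perception model already labels each video frame with a driver state, and the state names, the `ALERT_AFTER_FRAMES` threshold, and the `DriverMonitor` class are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical per-frame labels produced by an upstream perception model.
NORMAL = "normal"
RISKY_STATES = {"eyes_closed", "head_down", "head_turned"}

# Assumed threshold: alert after this many consecutive risky frames
# (about 1 second at 30 frames per second).
ALERT_AFTER_FRAMES = 30


@dataclass
class DriverMonitor:
    """Tracks how long the driver has stayed in the same risky state."""
    current_state: str = NORMAL
    streak: int = 0

    def update(self, frame_state: str) -> Optional[str]:
        """Consume one frame label; return an alert when the streak reaches the threshold."""
        if frame_state not in RISKY_STATES:
            self.current_state, self.streak = NORMAL, 0
            return None
        if frame_state == self.current_state:
            self.streak += 1
        else:
            self.current_state, self.streak = frame_state, 1
        # Fire once, exactly when the threshold is reached, so the alarm
        # is not repeated on every subsequent frame.
        if self.streak == ALERT_AFTER_FRAMES:
            return f"Voice alert: driver state '{frame_state}' detected"
        return None


monitor = DriverMonitor()
frames = ["normal"] * 5 + ["eyes_closed"] * 40
alerts = [msg for state in frames if (msg := monitor.update(state)) is not None]
print(alerts)
```

Requiring a run of consecutive frames means a normal blink or a brief glance away does not trigger the alarm; a real system would tune the threshold per behavior.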

What data is needed for in-cabin visual perception?

● Behavior Recognition Data

Monitors the status of drivers and passengers in real time and identifies dangerous behaviors, helping to prevent fatigued and dangerous driving.

● Vision Interaction Data

Vision interaction lets users send instructions through gestures and interact with on-board smart devices without touch, creating a more user-friendly driving experience.

● Identity Verification Data

Supports a variety of identity verification methods, addressing driver and passenger identity security.

Nexdata provides tailored data labeling and annotation services for in-cabin visual perception, covering driver and passenger head orientation detection, facial expressions, gaze tracking, gesture detection, and more. Our data solutions help train algorithms to accurately identify and analyze the identity, intentions, and behaviors of drivers and passengers.

Multi-race — Driver Behavior Collection Data

The data covers multiple age groups, multiple time periods, and multiple races (Caucasian, Black, Indian). Driver behaviors include dangerous behavior, fatigue behavior, and visual movement behavior. The data was captured with binocular cameras with RGB and infrared channels.

Passenger Behavior Recognition Data

The data covers multiple age groups, multiple time periods, and multiple races (Caucasian, Black, Indian). Passenger behaviors include normal behavior and abnormal behavior (carsickness, sleepiness, and losing items). The data was captured with binocular cameras with RGB and infrared channels.

50 Types of Dynamic Gesture Recognition Data

The collection scenes include indoor and outdoor scenes (natural scenery, street view, square, etc.). The data covers Chinese males and females, with ages ranging from teenagers to seniors. The data diversity includes multiple scenes, 50 types of dynamic gestures, 5 camera angles, multiple lighting conditions, and different shooting distances.

Lip Language Video Data

The dataset covers 1,998 people. The data diversity includes multiple scenes, multiple ages, and multiple time periods. In each video, the speaker’s lip movements while reading an 8-digit Arabic numeral were collected. The dataset contains 41,866 videos with a total duration of 86 hours 56 minutes 1.52 seconds. It can be used for tasks such as face anti-spoofing and lip language recognition.
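The headline figures for this dataset also give a feel for the size of each clip. A quick calculation from the numbers quoted above (the variable names are illustrative):

```python
# Figures quoted in the dataset description.
total_seconds = 86 * 3600 + 56 * 60 + 1.52  # 86 h 56 min 1.52 s
num_videos = 41_866
num_people = 1_998

avg_clip_seconds = total_seconds / num_videos
avg_videos_per_person = num_videos / num_people

print(f"Average clip length: {avg_clip_seconds:.2f} s")
print(f"Average videos per person: {avg_videos_per_person:.1f}")
```

That works out to roughly 7.5 seconds per video and about 21 videos per person. These are averages only; the description does not say how clips are distributed across speakers.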

1,417 People — 3D Living_Face & Anti_Spoofing Data

The collection scenes include indoor and outdoor scenes. The dataset includes males and females, with ages ranging from juveniles to the elderly; young and middle-aged people make up the majority. The capture devices are the iPhone X and iPhone XR. The data diversity includes various expressions, facial poses, anti-spoofing samples, multiple lighting conditions, and multiple scenes. The data can be used for tasks such as 3D face recognition and 3D Living_Face & Anti_Spoofing.

About

Founded in 2011, Nexdata is a professional artificial intelligence data service provider committed to providing high-quality training data and data services to AI companies worldwide. Drawing on its own data resources, technical strengths, and extensive data processing experience, Nexdata provides data services to more than 1,000 companies and institutions worldwide.

If you need data services, please feel free to contact us: info@nexdata.ai

Written by Nexdata