Building a Smart Home AI: What Kinds of Datasets Do We Need?

Nexdata
4 min read · Dec 10, 2022

One of the most important scenarios for conversational artificial intelligence is the smart home, where it can be widely used in speakers, TVs, mobile phones, robots, and other devices.

According to statistics, the global smart home market will reach US$115.7 billion in 2022 and grow to US$195.2 billion in 2026, an annual growth rate of 13.97%. On the consumer side, smart home penetration is also rising: household penetration worldwide will reach 14.2% in 2022 and is expected to increase to 25.0% by 2026.
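The quoted annual growth rate can be checked as a compound annual growth rate (CAGR) over the four years from 2022 to 2026. The sketch below assumes that this is how the figure was derived; the function name `cagr` is illustrative and not part of any library.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1


# Market size in billions of US dollars, taken from the figures quoted above.
market_growth = cagr(115.7, 195.2, 2026 - 2022)
print(f"Market CAGR 2022-2026: {market_growth:.2%}")

# The same formula applied to household penetration (14.2% to 25.0%).
# This rate is not stated in the article; it is derived here only for comparison.
penetration_growth = cagr(14.2, 25.0, 2026 - 2022)
print(f"Penetration CAGR 2022-2026: {penetration_growth:.2%}")
```

Under that assumption, the market figures reproduce the stated 13.97% rate, and household penetration grows at a somewhat faster pace than market size.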

A smart home is a residential platform that uses integrated wiring, network communication, security, automatic control, and audio and video technologies to combine the facilities of home life into an efficient system for managing residential facilities and family schedules. It improves home safety, convenience, comfort, and aesthetics, and provides an environmentally friendly, energy-saving living environment.

The development of machine learning, pattern recognition, and Internet of Things technology has brought a variety of interaction modes, making household products more intelligent and user-friendly. Control of these products is gradually shifting from mobile phone apps to more natural forms of human-computer interaction, and older control modes are being replaced by more optimized ones. From the earliest Wi-Fi networking control to today’s fingerprint and voice recognition, the interactive performance of smart home products has steadily improved.

Datatang has developed multiple sets of high-quality training data for smart home scenarios, which can be applied to tasks such as voice interaction, voice control, gesture control, and abnormal behavior detection.

50 Hours — American Children Speech Data by Microphone

The data was recorded by 219 American children, all native speakers. The recording texts are mainly storybooks, children’s songs, and spoken expressions, with 350 sentences per speaker. Each sentence contains 4.5 words on average and is repeated 2.1 times on average. The recording device is a hi-fi Blue Yeti microphone. The texts are manually transcribed.

521 People — Mandarin Voiceprint Recognition Speech Data by Mobile Phone

Each speaker was recorded over a long time span, so the data better covers a person’s voice characteristics across different periods and states.

1,417 People — 3D Living_Face & Anti_Spoofing Data

The collection scenes include both indoor and outdoor settings. The dataset includes males and females, with ages ranging from juveniles to the elderly; young and middle-aged people make up the majority. The collection devices are the iPhone X and iPhone XR. The data covers various expressions, facial poses, anti-spoofing samples, multiple light conditions, and multiple scenes. This data can be used for tasks such as 3D face recognition and 3D Living_Face & Anti_Spoofing.

5,438 People — Infrared Face Recognition Data

The collection scenes of this dataset include both indoor and outdoor settings. The data includes males and females, with ages ranging from children to the elderly; young and middle-aged people make up the majority. The collection device is the RealSense D453i. The data covers multiple age groups, multiple facial poses, and multiple scenes. It can be used for tasks such as infrared face recognition.

Multi-pose and Multi-expression Face Data

This dataset contains 102,476 images of 1,507 people. The subjects are 1,507 Asians (762 males, 745 females). For each subject, 62 multi-pose face images and 6 multi-expression face images were collected. The data covers multiple angles, multiple poses, and multiple light conditions, with subjects of all ages. This data can be used for tasks such as face recognition and facial expression recognition.
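As a quick sanity check, the per-subject counts above can be multiplied out and compared against the stated dataset size. This is only an arithmetic sketch using the numbers in the description; the variable names are illustrative.

```python
# Figures taken from the dataset description.
subjects = 1507
male_subjects = 762
female_subjects = 745
pose_images_per_subject = 62
expression_images_per_subject = 6
stated_total_images = 102_476

# Each subject contributes multi-pose plus multi-expression images.
images_per_subject = pose_images_per_subject + expression_images_per_subject
total_images = subjects * images_per_subject

print(f"Images per subject: {images_per_subject}")
print(f"Computed total: {total_images}, stated total: {stated_total_images}")
print(f"Gender split adds up: {male_subjects + female_subjects == subjects}")
```

The computed total agrees with the stated 102,476 images, and the gender split sums to the stated number of subjects.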

3D Instance Segmentation and 22 Landmarks Annotation Data of Human Body

This dataset contains 18,880 images of 466 people with 3D instance segmentation and 22-landmark annotations of the human body. The data covers multiple scenes, light conditions, ages, shooting angles, and poses. For annotation, we adopted instance segmentation of each human body and also annotated 22 landmarks per body. The dataset can be used for tasks such as human body instance segmentation and human behavior recognition.

18_Gestures Recognition Data

This dataset contains 314,178 images covering 18 gestures. The data covers multiple scenes, 18 gestures, 5 shooting angles, multiple ages, and multiple light conditions. For annotation, 21 landmarks per gesture (each with a visible or invisible attribute), the gesture type, and gesture attributes were labeled. This data can be used for tasks such as gesture recognition and human-machine interaction.
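To make the annotation layout concrete, here is a minimal sketch of how one annotated gesture image might be represented in code. The actual file format of the dataset is not described in this article, so every class, field name, and example value below (including the `"fist"` label and the `"hand"` attribute) is a hypothetical assumption; only the count of 21 landmarks and the visibility attribute come from the description.

```python
from dataclasses import dataclass, field

NUM_LANDMARKS = 21  # landmarks per gesture, as stated in the description


@dataclass
class Landmark:
    x: float       # hypothetical pixel coordinate
    y: float       # hypothetical pixel coordinate
    visible: bool  # visibility attribute of the landmark


@dataclass
class GestureAnnotation:
    gesture_type: str                  # one of the 18 gesture classes
    landmarks: list[Landmark]          # expected to hold exactly 21 entries
    attributes: dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject records that do not carry the full set of landmarks.
        if len(self.landmarks) != NUM_LANDMARKS:
            raise ValueError(
                f"expected {NUM_LANDMARKS} landmarks, got {len(self.landmarks)}"
            )

    def visible_ratio(self) -> float:
        """Fraction of landmarks marked as visible."""
        return sum(lm.visible for lm in self.landmarks) / NUM_LANDMARKS


# Hypothetical record: the first 18 landmarks are visible, the last 3 occluded.
example = GestureAnnotation(
    gesture_type="fist",
    landmarks=[Landmark(float(i), float(i), i < 18) for i in range(NUM_LANDMARKS)],
    attributes={"hand": "left"},
)
print(f"{example.gesture_type}: {example.visible_ratio():.0%} of landmarks visible")
```

A structure like this lets a training pipeline filter out heavily occluded samples, for example by skipping records whose visible ratio falls below a chosen threshold.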

In addition, Datatang supports a wide range of data collection requirements across scenarios, covering images, text, speech, and video. Datatang has professional data collection equipment and extensive experience in running data collection projects and in data quality control.

In the process of data collection, Datatang strictly abides by the GDPR’s regulations on personal data protection and has passed the ISO 27001 information security management system certification to fully safeguard data security.

End

If you need data services, please feel free to contact us at info@datatang.com.
