Speech recognition is a technology that allows machines to convert voice signals into the corresponding text or commands through recognition and understanding. It mainly involves three aspects: feature extraction, pattern matching criteria, and model training. Speech recognition is also widely used in the Internet of Vehicles.
Classification of Speech Recognition Technology
According to the object being recognized, speech recognition tasks can be roughly divided into three categories: isolated word recognition, keyword recognition (also called keyword detection or keyword spotting), and continuous speech recognition. Isolated word recognition identifies individual words that are known in advance, such as “power on” or “power off”. Continuous speech recognition identifies arbitrary continuous speech, such as a sentence or a paragraph. Keyword detection also operates on a continuous speech stream, but it does not transcribe all of the text; it only detects where a set of known keywords appears, for example finding “computer” and “world” in a paragraph.
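The article does not say which matching method a given system uses. As one illustration, isolated word recognition is classically approached with template matching under dynamic time warping (DTW), which compares an utterance with stored word templates while tolerating differences in speaking rate. The following is a minimal sketch under that assumption; the function names, the one-dimensional "feature" values, and the two templates are hypothetical stand-ins for real per-frame acoustic features.

```python
from math import inf


def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local frame-to-frame distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats a frame
                cost[i][j - 1],      # b advances, a repeats a frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def recognize(features, templates):
    """Return the label of the template closest to `features` under DTW."""
    return min(templates, key=lambda label: dtw_distance(features, templates[label]))


# Toy templates (hypothetical values, not real speech features).
TEMPLATES = {
    "power on": [1.0, 3.0, 5.0, 3.0, 1.0],
    "power off": [5.0, 4.0, 2.0, 1.0, 1.0],
}

# A slower utterance of the first pattern: same shape, stretched in time.
utterance = [1.0, 1.0, 3.0, 3.0, 5.0, 5.0, 3.0, 1.0]
print(recognize(utterance, TEMPLATES))  # prints "power on"
```

Because DTW lets one frame align with several frames of the other sequence, the stretched utterance matches the "power on" template at zero cost. A practical system would use multi-dimensional features and, in most modern designs, statistical or neural models instead of raw templates.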
According to the target speaker, speech recognition can be divided into speaker-dependent (specific-person) and speaker-independent (non-specific-person) recognition. The former can only recognize the speech of one or a few people, while the latter can be used by anyone. A speaker-independent system is clearly more in line with practical needs, but it is much harder to build than a speaker-dependent one.
In addition, according to the voice device and channel, speech recognition can be divided into desktop (PC), telephone, and embedded-device (mobile phone, PDA, etc.) recognition. Different acquisition channels distort the acoustic characteristics of human speech in different ways, so a separate recognition system needs to be built for each.
Features of Speech Recognition
Compared with other biometric technologies, speech recognition has the advantages that a voice cannot be lost or forgotten, requires no memorization, and is convenient to use; it also enjoys high user acceptance and low-cost voice input equipment. Since user privacy issues are not involved, applications can be promoted easily.
Security expert Glen Greer pointed out that while voice recognition is convenient, it is not very reliable because of the risks of impersonation and remote control and because of its low accuracy. A person with a cold could be wrongly denied access by a voice recognition system. Many other factors also affect reliability, such as the quality of the sound sample, the speaker's mood, background noise, and changes in the voice over time.
Founded in 2011, Datatang is a professional artificial intelligence data service provider committed to providing high-quality training data and data services for AI companies worldwide. Relying on its own data resources, technical advantages, and extensive data processing experience, Datatang provides data services to more than 1,000 companies and institutions worldwide. Datatang entered the Chinese stock market (NEEQ: 831428) in 2014 and became the first listed company in China’s artificial intelligence data service industry.