New Breakthrough in Intelligent Voice Technology, from Hey Siri to Siri

Recently, Apple was reported to be developing a new wake-up technology to improve the Siri voice assistant experience. If the development succeeds, users will no longer need to say “Hey Siri” to wake the voice assistant; “Siri” alone will be enough. The change is expected to roll out sometime in 2023 or 2024.

Coincidentally, on November 1, Xiaodu Technology, a subsidiary of Baidu, launched the 15.6-inch ultra-large-screen “Tian Tian Zi You Ping” tablet. In this product, Xiaodu adopts what it calls “two-character one-shot” technology. Users only need to say “Xiaodu” followed by any phrase to wake the device, and no longer need to say the traditional four-character wake-up word “Xiaodu Xiaodu”.

The wake-up word of an intelligent voice assistant is like a person’s name. “Siri” is the name of Apple’s intelligent voice assistant, and saying it lets the assistant know that the user is addressing it.

Voice wake-up is also called keyword spotting (KWS): a specific segment, the wake-up word, is detected in real time within a continuous stream of speech. Generally speaking, unless it is manually disabled, the voice assistant stays resident in the background as a system-level service. However, because it demands a lot of AI computing power, an intelligent voice assistant incurs significant performance overhead while it is working, and power consumption rises accordingly.
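To make the idea of spotting a wake-up word in a continuous stream more concrete, here is a minimal sketch. It assumes that some acoustic model already outputs, for every audio frame, a probability that the wake-up word is present; the scores below are made up. The function name `spot_keyword`, the window size, and the threshold are illustrative assumptions, not any vendor's actual pipeline.

```python
from collections import deque
from typing import Iterable, Iterator


def spot_keyword(frame_scores: Iterable[float],
                 window: int = 3,
                 threshold: float = 0.8) -> Iterator[int]:
    """Yield the index of each frame at which the wake-up word is detected.

    frame_scores: hypothetical per-frame wake-word probabilities in [0, 1].
    """
    recent = deque(maxlen=window)  # sliding window over the latest scores
    for i, score in enumerate(frame_scores):
        recent.append(score)
        # Decide only once the window is full; averaging suppresses lone spikes.
        if len(recent) == window and sum(recent) / window >= threshold:
            yield i
            recent.clear()  # avoid re-triggering on the same utterance


# Made-up scores: one isolated spike (ignored) and one sustained run (detected).
scores = [0.1, 0.95, 0.1, 0.2, 0.9, 0.9, 0.9, 0.3]
detections = list(spot_keyword(scores))
```

Averaging over a short window is one simple way to require that the evidence for the wake-up word is sustained rather than a single noisy frame.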

To address this problem, developers came up with a low-power coprocessor that monitors the microphone in real time. When it detects the wake-up word, it switches the voice assistant from the sleep state to the working state. This greatly reduces the pressure on the device’s battery life, and it also keeps an always-on voice assistant from processing audio that is not addressed to it.
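The sleep-to-working handoff described above can be sketched as a small state machine. This is a toy model under loud assumptions: utterances are plain text rather than audio, the "coprocessor" is a simple prefix check, and the class and method names (`VoiceAssistant`, `hear`) are hypothetical.

```python
from enum import Enum
from typing import List


class State(Enum):
    SLEEP = "sleep"
    WORKING = "working"


class VoiceAssistant:
    """Toy model: a cheap always-on detector gates the expensive assistant."""

    def __init__(self, wake_word: str = "siri") -> None:
        self.wake_word = wake_word
        self.state = State.SLEEP
        self.processed: List[str] = []  # utterances the full assistant handled

    def _low_power_detect(self, utterance: str) -> bool:
        # Stand-in for the coprocessor's keyword spotter (text match, not audio).
        return utterance.lower().startswith(self.wake_word)

    def hear(self, utterance: str) -> bool:
        """Return True if the full assistant handled the utterance."""
        if not self._low_power_detect(utterance):
            return False  # main assistant stays asleep; the audio is dropped
        self.state = State.WORKING
        self.processed.append(utterance)  # expensive processing would go here
        self.state = State.SLEEP  # go back to low-power listening
        return True


assistant = VoiceAssistant()
ignored = assistant.hear("what time is it")      # no wake-up word
handled = assistant.hear("Siri, set a timer")    # wake-up word present
```

Only the cheap check runs for every utterance; the expensive path is reached only after the wake-up word is found, which is the source of the power saving.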

From Apple changing its wake-up word from “Hey Siri” to “Siri”, to Xiaodu Technology dropping its four-character wake-up word, these changes may seem very small, but they reflect the progress and breakthroughs of AI voice technology. In the field of intelligent voice assistants, both long and short wake-up words have drawbacks. The fewer the syllables, the easier it is to trigger a false wake-up; the more the syllables, the worse the user experience. In the beginning, the “Hey” in Apple’s wake-up word “Hey Siri” was there to add a syllable and so increase the accuracy of detection.
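A back-of-the-envelope calculation shows why fewer syllables make false wake-ups more likely. The sketch below uses a deliberately crude toy model: it assumes each syllable of the wake-up word is matched by unrelated speech independently with the same probability, which real speech does not obey. The value 1/10 is made up purely for illustration.

```python
from fractions import Fraction


def false_wake_chance(per_syllable_match: Fraction, syllables: int) -> Fraction:
    """Toy estimate: chance that unrelated speech matches every syllable.

    Assumes syllables match independently, which is a simplification.
    """
    return per_syllable_match ** syllables


p = Fraction(1, 10)  # made-up chance that one syllable matches by accident
two = false_wake_chance(p, 2)    # a two-syllable wake-up word, like "Siri"
three = false_wake_chance(p, 3)  # a three-syllable one, like "Hey Siri"
print(two, three)  # prints "1/100 1/1000"
```

Under these assumptions, removing one syllable makes an accidental match ten times more likely, so the detector itself has to become more accurate to compensate.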

Dropping “Hey” suggests that Apple’s intelligent voice technology has made great progress and can now judge the user’s intent from a single word. Most likely, Apple uses voiceprint recognition technology to separate out the target speaker’s voice, and then uses a convolutional neural network with a voiceprint recognition encoder to accurately capture the voice of the target user in a complex acoustic environment.
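Apple’s actual method is not public, so the following is only a generic sketch of how voiceprint matching is commonly illustrated: each speaker is represented by an embedding vector, and a new utterance is accepted if its embedding is close enough to the enrolled one. The three-dimensional vectors, the threshold of 0.7, and the function names are all made-up assumptions; real systems use learned embeddings with many more dimensions.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def is_target_speaker(enrolled: Sequence[float],
                      candidate: Sequence[float],
                      threshold: float = 0.7) -> bool:
    """Accept the candidate if its voiceprint is close to the enrolled one."""
    return cosine_similarity(enrolled, candidate) >= threshold


# Made-up embeddings standing in for the output of a voiceprint encoder.
enrolled = [0.9, 0.1, 0.4]
same_user = [0.8, 0.2, 0.5]
other_user = [0.1, 0.9, 0.2]

accept_same = is_target_speaker(enrolled, same_user)
accept_other = is_target_speaker(enrolled, other_user)
```

Combining a check like this with the wake-up word detector is one way a short word such as “Siri” could be accepted only when it comes from the device’s owner.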

About Datatang

Founded in 2011, Datatang is a professional artificial intelligence data service provider committed to providing high-quality training data and data services for AI companies worldwide. Relying on its own data resources, technical advantages, and extensive data processing experience, Datatang provides data services to 1000+ companies and institutions worldwide. Datatang entered the Chinese stock market (NEEQ: 831428) in 2014 and became the first listed company in China’s artificial intelligence data service industry.
