Datatang’s Children Speech Data Helps Build the Best Voice Assistant for Kids
Speech recognition research has found that systems built on adult speech databases recognize children’s speech poorly and make many recognition errors.
Children’s voices and articulation differ from those of adults, so their speech has different acoustic and linguistic characteristics, which makes children’s speech recognition inherently difficult. In addition, children are not good at interacting with machines in a way the machines can understand. Whether the product offers a friendlier interface or a more intelligent voice assistant, the recognition results remain unsatisfactory.
To solve this problem, Datatang has developed over 10,000 hours of children’s speech data. It is recorded by children, and the recording content conforms to the characteristics of children’s speech.
The American English data is recorded by 219 American children who are native speakers. The recording texts are mainly storybooks, children’s songs, spoken expressions, etc., with 350 sentences per speaker. Each sentence contains 4.5 words on average and is repeated 2.1 times on average. The recording device is a hi-fi Blue Yeti microphone. The texts are manually transcribed.
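The figures quoted for the American English data can be turned into a rough size estimate. The sketch below is only back-of-the-envelope arithmetic on the published averages, not an official statistic. It assumes that "350 sentences for each speaker" counts recorded utterances and that "repeated 2.1 times" means each distinct prompt appears about 2.1 times across the corpus. All variable names are illustrative.

```python
# Figures quoted in the dataset description
speakers = 219
sentences_per_speaker = 350
avg_words_per_sentence = 4.5
avg_repetitions = 2.1

# Total recorded utterances across all speakers
total_utterances = speakers * sentences_per_speaker

# Approximate number of spoken words in the corpus
approx_total_words = round(total_utterances * avg_words_per_sentence)

# Assumption: each distinct prompt is recorded about 2.1 times in total,
# so the number of distinct prompts is utterances / repetitions
approx_distinct_sentences = round(total_utterances / avg_repetitions)

print(f"utterances:         {total_utterances}")
print(f"words (approx.):    {approx_total_words}")
print(f"distinct sentences: {approx_distinct_sentences}")
```

If the repetition figure is defined differently (for example, per speaker rather than across the corpus), the distinct-sentence estimate changes accordingly.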
The British English data is collected from 201 British children. The recording texts are mainly children’s textbooks and storybooks. The average sentence length is 4.68 words, and each sentence is repeated 6.6 times on average. The data is recorded with a high-fidelity microphone. The text is manually transcribed with high accuracy.
This is audio data of children reading English, covering ages from preschool (3–5 years old) to school age (6–12 years old) and reflecting the features of children’s speech. The content accurately matches the situations in which children actually speak English. It provides data support for children’s smart home devices, automatic speech recognition, and oral assessment in intelligent education scenarios.
This is audio data of Chinese children captured by mobile phone, with a total duration of 3,255 hours. The 9,780 speakers are children aged 6 to 12, with accents covering seven dialect areas. The recorded text covers language commonly used by children, such as essay stories, numbers, and interactions in cars, at home, and with voice assistants, precisely matching actual application scenarios. All sentences are manually transcribed with high accuracy.
500 Hours — Korean Children Speech Data by Mobile Phone
The dataset is recorded by local Korean children. About 1,500 children participated in the recording, all with authentic accents. The 500 hours of speech data, collected from Korean children on mobile phones, can be used for speech recognition, language model training, or algorithm research.
If you would like more details about the datasets or how to acquire them, please feel free to contact us: email@example.com.