Improving Your AI Models with High-Quality Conversational Speech Data

Currently, most of the speech data on the market is read speech. However, interaction between humans and machines should go beyond simple question-and-answer dialogue or command control: a system should understand the context of the language, recognize the speaker’s speech and emotion, and respond accordingly.

As technological breakthroughs improve the user experience, conversational voice interaction has become a focus for the AI giants. Google, Amazon, Alibaba, Tencent, Baidu, Xiaomi, and others have launched smart speakers, smart assistants, smart customer service systems, and smart robots that support multi-turn continuous dialogue. This continuous dialogue capability is expected to drive technological change in industries such as finance, education, the Internet, transportation, mobile communications, and manufacturing.

As a world-leading AI data service provider, Datatang offers a series of natural dialogue speech datasets in dozens of languages, including Mandarin, Chinese dialects, English, French, German, Russian, Spanish, Japanese, Korean, Hindi, and Thai. The datasets cover a variety of pronunciation habits and characteristics, degrees of accent, and speaker distributions.

1,351 Hours — Mandarin Conversational Speech Data by Mobile Phone and Voice Recorder

1,950 speakers participated in the recording and talked face to face in a natural way. They freely discussed a number of given topics covering a wide range of fields; the speech is natural and fluent, in line with real dialogue scenarios. The sentence accuracy is 97%.

500 Hours — Minnan Dialect Conversational Speech Data by Mobile Phone

The dataset contains 500 hours of Minnan dialect conversational speech data. It was recorded by local speakers from Xiamen, Zhangzhou, and Quanzhou. The speakers start each conversation from a familiar topic, which keeps the conversation smooth and natural. The sentence accuracy is over 95%.

1,000 Hours — American English Natural Dialogue Speech Data

The dataset contains 1,000 hours of American English conversational speech data. It was recorded by 2,000 native speakers. The speakers start each conversation from a familiar topic, which keeps the conversation smooth and natural. The sentence accuracy is over 95%.

500 Hours — French Conversational Speech Data by Mobile Phone

The dataset contains 500 hours of French conversational speech data. It was recorded by about 1,000 native speakers. The speakers start each conversation from a familiar topic, which keeps the conversation smooth and natural. The sentence accuracy is over 95%.

500 Hours — Spanish Conversational Speech Data by Mobile Phone

The dataset contains 500 hours of Spanish conversational speech data. It was recorded by about 1,000 native speakers. The speakers start each conversation from a familiar topic, which keeps the conversation smooth and natural. The sentence accuracy is over 95%.

If the datasets above do not meet the needs of your current research, Datatang also provides data customization services for specific groups of people, scenarios, and languages to meet customers’ diverse data needs.

If you need data services, please feel free to contact us: info@datatang.com

Off-the-shelf AI training data, on-demand data collection & annotation services