Why Conversational Speech Recognition Will Be the Future of Voice Technology

At present, speech technology companies around the world achieve broadly similar word error rates on read speech. As vertical application scenarios multiply, more and more companies are increasing their R&D investment in conversational speech recognition.
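For readers unfamiliar with the metric, word error rate (WER) is conventionally the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. The following is a minimal sketch of that standard definition; the function name and the whitespace tokenization are illustrative assumptions, not a description of any particular vendor's evaluation pipeline.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count.

    Assumption: words are separated by whitespace and compared exactly,
    with no text normalization (case, punctuation) applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Conversational speech tends to score worse on this metric than read speech because of hesitations, overlaps, and informal phrasing, which is why dedicated conversational training data matters.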

Speech recognition technology has attracted growing attention over the years. It is becoming a routine part of daily life through computers, smartphones, and smart devices.

The rapid growth of voice devices, rising consumer demand for smart devices, and the integration of in-vehicle infotainment systems are key factors driving the voice recognition market. The increasing use of AI in automotive, healthcare, and consumer electronics has further raised demand for voice-enabled devices, as has the growing demand for voice applications in smart speakers, smart wearables, connected cars, and smart homes.

According to the latest report released by Meticulous Market Research, the speech recognition market is expected to reach USD 26.79 billion by 2025, growing at a compound annual growth rate (CAGR) of 17.2% from 2019 to 2025.
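To make the growth figure concrete, the compound-growth relation can be sketched in a few lines. This is a back-of-the-envelope illustration only: it assumes six annual compounding periods between 2019 and 2025, and the 2019 base value it derives is implied by the formula, not a number taken from the report. The function names are hypothetical.

```python
def project(start_value: float, cagr: float, years: int) -> float:
    """Grow start_value at a constant annual rate for the given number of years."""
    return start_value * (1 + cagr) ** years


def implied_start_value(end_value: float, cagr: float, years: int) -> float:
    """Invert the compound-growth formula to recover the starting value."""
    return end_value / (1 + cagr) ** years


# Figures quoted in the article: USD 26.79 billion in 2025, 17.2% CAGR.
# Assumption: 2019 -> 2025 is treated as 6 compounding years.
base_2019 = implied_start_value(26.79, 0.172, 6)
print(f"Implied 2019 market size: USD {base_2019:.2f} billion")
```

Under these assumptions the implied 2019 market size comes out at a little over USD 10 billion, meaning the market would grow roughly 2.6-fold over the forecast period.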

High-quality training data is the basis of good AI. Datatang offers 200,000 hours of off-the-shelf speech data, including nearly 40,000 hours of natural dialogue speech, covering Mandarin Chinese, Chinese dialects, English, Japanese, Korean, Hindi, Vietnamese, Arabic, Spanish, French, German, Italian, and other languages.

All audio has been manually transcribed and has passed strict quality inspection. The text content, the start and end times of valid sentences, and the identity of the speaker are annotated, with a sentence accuracy rate of 95%.
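As a rough illustration of what such an annotation might look like in code, the sketch below models one annotated sentence and computes a sentence accuracy rate. Everything here is an assumption made for illustration: the field names, the time unit, and the definition of sentence accuracy as the share of sentences whose transcript exactly matches a verified reference. It is not Datatang's actual delivery format or quality metric.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Utterance:
    """One annotated sentence (hypothetical schema, for illustration only)."""
    speaker_id: str   # identity of the speaker (recorder)
    start_sec: float  # start time of the valid sentence, in seconds
    end_sec: float    # end time of the valid sentence, in seconds
    text: str         # manually transcribed text content


def sentence_accuracy(annotated: Sequence[Utterance], verified: Sequence[str]) -> float:
    """Share of sentences whose transcript exactly matches the verified text.

    Assumption: a sentence counts as correct only on an exact string match.
    """
    if len(annotated) != len(verified):
        raise ValueError("annotated and verified must have the same length")
    if not annotated:
        raise ValueError("at least one sentence is required")
    correct = sum(u.text == v for u, v in zip(annotated, verified))
    return correct / len(annotated)


# Tiny usage example with made-up data.
sample = [
    Utterance("spk_001", 0.00, 2.35, "how was your weekend"),
    Utterance("spk_002", 2.60, 4.10, "pretty good thanks"),
]
rate = sentence_accuracy(sample, ["how was your weekend", "pretty good, thanks"])
```

Under this definition, a 95% sentence accuracy rate would mean that 95 out of every 100 sampled sentences match the verified transcript exactly.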

Korean Conversational Speech Data by Mobile Phone

About 600 Korean speakers took part in the recording, conversing face to face in a natural way. They discussed a number of given topics freely, covering a wide range of fields.

American English Natural Dialogue Speech Data

The dataset contains 1,000 hours of American English conversational speech data, recorded by 2,000 native speakers. The speakers start each conversation from a familiar topic to keep the dialogue smooth and natural.

French Conversational Speech Data by Mobile Phone

The dataset contains 500 hours of French conversational speech data, recorded by about 1,000 native speakers. The speakers start each conversation from a familiar topic to keep the dialogue smooth and natural.

German Conversational Speech Data by Mobile Phone

Nearly 300 speakers took part in the recording, conversing face to face in a natural way. They discussed a number of given topics freely, covering a wide range of fields. The speech is natural and fluent, consistent with real dialogue scenarios.

Mandarin Mobile Telephony Conversational Speech Collection Data

About 5,000 speakers took part in the recording, conversing face to face in a natural way. No topics were specified, so the conversations cover a wide range of fields. The speech is natural and fluent, consistent with real dialogue scenarios.

Cantonese Conversational Speech Data

Nearly 1,000 local Cantonese speakers took part in the recording, conversing face to face in a natural way. They discussed a number of given topics freely, covering a wide range of fields. The speech is natural and fluent, consistent with real dialogue scenarios.

Datatang’s conversational speech datasets have helped more than 100 companies worldwide and have been successfully applied in scenarios such as intelligent customer service, intelligent conferencing, and automatic video subtitle generation.

The development of AI is a long historical process, and it has now entered an era of large-scale deployment. In the future, as technologies such as 5G develop alongside it, more and more speech recognition applications will enable barrier-free communication across different languages, ethnicities, and regions.

If you need data services, please feel free to contact us: info@datatang.com.
