How Automated Data Labeling Tools Fuel Autonomous Vehicles

In early 2021, Tesla disclosed that it was hiring a team of data labelers at its Gigafactory in New York to support image labeling and help train the Autopilot/FSD neural networks. According to previously released figures, Tesla’s data labeling team totals nearly 1,000 people.

An AI chief at Tesla revealed last year that the company has only “dozens” of engineers working on neural networks but a “huge” team working on labels. On the one hand, high-quality manual labeling is still the groundwork; on the other hand, automated labeling is a growing trend, needed to process the large volume of data collected by fleets. The annotation team will work with computer vision engineers on the Autopilot team to improve the design of internal annotation tools. At the same time, the annotators will gain basic computer vision and machine learning knowledge to better understand how the algorithms behind data labeling work.

The message behind this is that data labeling is not simply drawing bounding boxes, nor is it purely a matter of labeling objects one by one. “This method is time-consuming and expensive.” Some industry insiders point out that the data delivered by traditional outsourcers, even after being reworked many times, still cannot meet the accuracy customers require.

It is easy to predict that the next wave in data labeling will be automated tools. As more new vehicles ship with data collection and upload functions, processing huge volumes of data has become a hard requirement. This means that labeling efficiency and accuracy determine the iteration speed of computer vision and multi-sensor fusion perception technologies. “High-quality data is a decisive factor in a sense.” In the eyes of industry professionals, data that is both high in quality and produced efficiently is also the key to shortening the feature development cycle.

As the world’s leading AI data service provider, Datatang has launched a self-developed data annotation tool with a built-in ML-assisted pre-recognition function, which enables semi-automatic data processing and can improve per-annotator efficiency by more than 30%. Its nearly 30 annotation tools can be applied flexibly to many types of data, such as voice, image, 3D point cloud, and text, and have been used in nearly 5,000 projects over 11 years.

For example, a missed label is a serious labeling error. Datatang has built ground detection algorithms and automatic color rendering into the tool, so annotators can identify labeled objects by their color, which reduces missed labels. The tool also has a built-in interpolation algorithm for pre-labeling. If a target ID is labeled in the first and fifth frames, its position in the intermediate frames is labeled automatically, and the annotator only needs to check or fine-tune it.
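The keyframe interpolation described above can be sketched roughly as follows. This is a minimal illustration, not Datatang's actual implementation: it assumes axis-aligned 2D boxes and plain linear interpolation, and the names `Box` and `interpolate_boxes` are hypothetical. A production tool may use a different box format (for example 3D cuboids) or a motion model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned 2D box: top-left corner plus width and height, in pixels."""
    x: float
    y: float
    w: float
    h: float


def interpolate_boxes(start_frame, start_box, end_frame, end_box):
    """Return {frame: Box} for the frames strictly between two labeled keyframes.

    Assumption: the object moves and resizes linearly between the keyframes.
    """
    if end_frame <= start_frame:
        raise ValueError("end_frame must come after start_frame")
    span = end_frame - start_frame
    pre_labels = {}
    for frame in range(start_frame + 1, end_frame):
        t = (frame - start_frame) / span  # fraction of the way to end_frame
        pre_labels[frame] = Box(
            x=start_box.x + t * (end_box.x - start_box.x),
            y=start_box.y + t * (end_box.y - start_box.y),
            w=start_box.w + t * (end_box.w - start_box.w),
            h=start_box.h + t * (end_box.h - start_box.h),
        )
    return pre_labels


# The annotator labels the same target ID in frames 1 and 5 (made-up values).
pre_labels = interpolate_boxes(1, Box(100, 200, 50, 40), 5, Box(140, 220, 50, 40))
for frame, box in sorted(pre_labels.items()):
    print(frame, box)
```

With these made-up keyframes, frames 2, 3 and 4 are pre-labeled, and the frame 3 box sits halfway between the two keyframes at (120, 210). The annotator would then only review these boxes and nudge any that drift from the object.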

In addition to data annotation tools, Datatang also provides ready-to-go training datasets. Datatang has 65,000 hours of in-cabin speech datasets and more than 100 sets of computer vision datasets, helping our customers with the development of autonomous driving technology.

● In-Cabin Speech Datasets

Japanese Speaking English Speech Data by Mobile Phone

The dataset is recorded by native Japanese speakers and is balanced for gender. The recording corpus is rich in content and covers a wide range of domains, such as generic command and control, human-machine interaction, smart home and in-car.

Hindi Conversational Speech Data by Mobile Phone

About 1,000 speakers participated in the recording and communicated face to face in a natural way. They discussed a number of given topics freely, covering a wide range of fields; the speech is natural and fluent, in line with real dialogue scenes.

British English Speech Data by Mobile Phone

The data is recorded by native British speakers. The recording contents cover many categories such as generic, interactive, in-car and smart home.

Chinese-English Mixed Average Tone Speech Synthesis Corpus-Customer Service

It is recorded by native Chinese speakers reading customer service text, with balanced syllables, phonemes and tones. Professional phoneticians participate in the annotation.

● Computer Vision Datasets

Multi-race — Driver Behavior Collection Data

The data includes multiple age groups, multiple time periods and multiple races (Caucasian, Black, Indian). The driver behaviors include dangerous behavior, fatigue behavior and visual movement behavior.

Passenger Behavior Recognition Data

The data includes multiple age groups, multiple time periods and multiple races (Caucasian, Black, Indian). The passenger behaviors include normal behavior and abnormal behavior (carsick behavior, sleepy behavior, lost items behavior).

50 Types of Dynamic Gesture Recognition Data

The data covers males and females. The age distribution ranges from teenager to senior. The data diversity includes multiple scenes, 50 types of dynamic gestures, 5 photographic angles, multiple light conditions, different photographic distances.

In addition, Datatang supports on-demand data collection services for customers, such as cockpit personnel behavior collection, 2D street view data collection, and multi-language and multi-group voice collection in driving scenarios.


If you need data services, please feel free to contact us:

Off-the-shelf AI training data, on-demand data collection & annotation services
