Using Datasets for Machine Learning to Generate Music

AIGC (AI-Generated Content) refers to content produced mainly with the help of AI technology. It is now considered the latest mode of content creation after UGC (user-generated content) and PGC (professionally generated content), and is widely used in creative fields such as games, painting, art, and music. In the past, AIGC technology was mainly used for text processing. In recent years, as deep learning models have matured and become widespread, AIGC technology has made fundamental breakthroughs.

Recently, Microsoft filed a patent application, listed on the WIPO intellectual property portal, titled “Artificial Intelligence Models for Composing Audio Scores”.

This is an intelligent audio synthesis technology for creating sounds, music, and other audio elements for media formats such as movies, TV shows, games, and even live broadcasts. The patent mentions dynamic moments in games, suggesting it could create content that changes with the player’s actions. According to the patent abstract, the algorithm can use visual, audio, and text features and cues (collectively referred to as “datasets”) to set parameters that instruct a large number of AI models to construct audio content.

In the patent, Microsoft also describes a variety of related AI models that can analyze information such as human emotions, expressions, and the current situation, and combine it with images, text, and other inputs to automatically generate audio that fits what is currently on screen.

Drawing on extensive experience delivering TTS projects and on advanced TTS technology, Datatang provides high-quality, multi-scenario, multi-category music datasets for machine learning.

103 Chinese Mandarin Songs in Acapella — Female

It was recorded by a professional Chinese singer with a sweet voice. Professional phoneticians participated in the annotation. The dataset closely matches the research and development needs of song synthesis.

In addition, Datatang has rich voice sample resources, strong technical capabilities, and data processing experience, and supports customized collection for a designated language, timbre, age, and gender. Datatang also offers data customization services such as audio segmentation, phoneme boundary segmentation (segmentation accuracy of 0.01 seconds), phonetic tagging, prosody tagging, part-of-speech tagging, pitch proofreading, rhythm tagging, and musical score production to fully meet customers’ diverse requirements.
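To make the phoneme boundary segmentation mentioned above more concrete, here is a minimal Python sketch of how such annotations might be represented and sanity-checked. It is not Datatang's actual delivery format: the `PhonemeSegment` class, the `to_centiseconds` and `check_segments` helpers, and the sample phonemes are all hypothetical. It also assumes that the 0.01-second accuracy means boundaries sit on a 0.01-second grid, so times are stored as whole centiseconds.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PhonemeSegment:
    """One annotated phoneme (hypothetical layout, not a real dataset schema)."""
    phoneme: str
    start_cs: int  # start time in centiseconds (units of 0.01 s)
    end_cs: int    # end time in centiseconds (units of 0.01 s)


def to_centiseconds(seconds: float) -> int:
    """Snap a time in seconds to the nearest 0.01 s grid point."""
    return round(seconds * 100)


def check_segments(segments: List[PhonemeSegment]) -> bool:
    """Return True if every segment has positive length and each one
    starts exactly where the previous one ends."""
    for seg in segments:
        if seg.end_cs <= seg.start_cs:
            return False
    for prev, cur in zip(segments, segments[1:]):
        if cur.start_cs != prev.end_cs:
            return False
    return True


# Hypothetical annotation for one sung syllable.
segments = [
    PhonemeSegment("n", to_centiseconds(0.00), to_centiseconds(0.12)),
    PhonemeSegment("i", to_centiseconds(0.12), to_centiseconds(0.37)),
]
```

Storing boundaries as integers avoids floating-point comparison problems when checking that adjacent segments meet exactly; a real annotation format may well differ.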


If you need data services, please feel free to contact us at



Off-the-shelf AI training data, on-demand data collection & annotation services
