After text, image and 3D-object generation, a new kind of artificial intelligence is taking shape. Called MusicLM, the model developed by Google is reportedly able to create music on its own. Researchers at the American firm detailed their progress in a scientific paper published on January 26, 2023 and reported on by TechCrunch.
Music from a few sentences
MusicLM is not the first AI model to claim to create music on demand, but it appears to be the most accomplished so far. The program is reportedly able to generate coherent music lasting several minutes, and it needs only one piece of input to work: a textual description. From a simple phrase, it can produce dedicated background music or a film melody, for example. According to Google's researchers, the model is far more advanced than its counterparts, with high fidelity to user requests and strong audio quality.
For the occasion, the Mountain View company has set up a demonstration site where it is possible to listen to music generated by MusicLM. Even if the tracks are not yet up to the standard of a beginner beatmaker, the first examples are striking nonetheless.
To reproduce the soundtrack of an arcade game, the researchers gave the artificial intelligence only a few words: “The main soundtrack of an arcade game. It is fast-paced and upbeat, with a catchy electric guitar riff. The music is repetitive and easy to remember, but with unexpected sounds, like cymbal crashes or drum rolls.” A fairly well-produced 30-second piece, with reasonable musicality throughout, is then generated.
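For a concrete idea of the workflow, here is a minimal sketch of how such a prompt could drive a text-to-music system. MusicLM is not publicly available, so generate_music below is a hypothetical placeholder (it simply writes 30 seconds of silence to a WAV file), not Google's actual interface.

```python
# Illustrative sketch only: MusicLM has no public API, so generate_music is an
# imagined stand-in for a text-conditioned music generator. Here it returns
# silence so the script runs end to end; a real model would return actual audio.
import wave

PROMPT = (
    "The main soundtrack of an arcade game. It is fast-paced and upbeat, with a "
    "catchy electric guitar riff. The music is repetitive and easy to remember, "
    "but with unexpected sounds, like cymbal crashes or drum rolls."
)

SAMPLE_RATE = 24_000  # assumed output sample rate, purely for illustration


def generate_music(prompt: str, duration_s: int = 30) -> bytes:
    """Hypothetical placeholder for a text-to-music call; returns 16-bit mono PCM."""
    n_samples = duration_s * SAMPLE_RATE
    return b"\x00\x00" * n_samples  # silence standing in for generated audio


def save_wav(pcm: bytes, path: str) -> None:
    """Write raw 16-bit mono PCM bytes to a .wav file."""
    with wave.open(path, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 16-bit samples
        f.setframerate(SAMPLE_RATE)
        f.writeframes(pcm)


if __name__ == "__main__":
    audio = generate_music(PROMPT, duration_s=30)
    save_wav(audio, "arcade_soundtrack.wav")
```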
280,000 hours of music studied
MusicLM can also produce music based solely on melodies that are hummed, sung, whistled or played on an instrument. An image paired with its caption can even give rise to a tune. To train the AI through deep learning, the researchers used a library of more than 280,000 hours of music; this broad, diverse dataset lets the model generate songs that are both coherent and complex.
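To put that figure in perspective, here is a quick back-of-the-envelope calculation; the sample rate and bit depth are assumptions made for illustration, not details given in the article or the paper.

```python
# Rough scale of 280,000 hours of training audio. The 24 kHz, 16-bit mono format
# is an assumed encoding for illustration, not a figure from Google's paper.
HOURS = 280_000
SAMPLE_RATE = 24_000   # assumed samples per second
BYTES_PER_SAMPLE = 2   # assumed 16-bit mono PCM

seconds = HOURS * 3600                  # ~1.0 billion seconds of music
samples = seconds * SAMPLE_RATE         # ~24 trillion samples
raw_bytes = samples * BYTES_PER_SAMPLE  # ~48 TB uncompressed

print(f"{seconds:,} seconds of audio")
print(f"{samples:,} samples at {SAMPLE_RATE:,} Hz")
print(f"~{raw_bytes / 1e12:.1f} TB of uncompressed 16-bit mono PCM")
```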
The researchers noted only a few technical limitations. The model reportedly misunderstands negations in a prompt and struggles to follow the user's temporal instructions faithfully. Going forward, the engineers plan to add speech generation and improve the AI's text comprehension. Finally, they also aim to better structure the generated audio so that it reproduces an intro, verses and a chorus more accurately.
For now, Google has not announced a public release date for MusicLM.