Fugatto: Nvidia Unveils Its Groundbreaking AI Music Model

Updated on December 16, 2024

Nvidia Corporation has announced its entry into the field of generative artificial intelligence, joining the ranks of Meta Platforms Inc., OpenAI and Runway AI Inc. with a new model designed to generate original music and audio from natural-language prompts.

The model's name, Fugatto, is short for Foundational Generative Audio Transformer Opus 1. It is distinguished by its ability to modify human voices and to produce unique sounds that other models cannot yet achieve.

Nvidia emphasizes that Fugatto stands apart from other audio and music generation models because of its capacity to absorb and transform existing sounds. For instance, it can take a piano melody and convert it into vocal notes, or reinterpret it on a different instrument such as the violin.

It can also modify a recording of a human voice to change the accent and emotional tone of the performance. It might be misleading, however, to claim that the sounds Fugatto produces are completely original. Like all AI models, it generates its outputs with algorithms that draw on existing data to fulfil users' prompts.

Even so, Nvidia maintains that Fugatto can create soundscapes that users may never have encountered before, by layering two different audio effects to generate something new.

In a demonstration video shared on YouTube, the company shows Fugatto creating the sound of a train that gradually transforms into an orchestral piece, along with other capabilities such as altering cheerful voices to sound angry.

Nvidia claims that such features have never before been seen in audio generation models. Beyond basic prompt engineering, Fugatto also gives users more precise controls for editing.

Vikhyaat Vivek

Tech Journalist