NVIDIA has debuted a new experimental generative AI model, which it describes as “a Swiss Army knife for sound.” The model, called Foundational Generative Audio Transformer Opus 1, or Fugatto, can take commands from text prompts and use them to create audio or to modify existing music, voice and sound files. It was designed by a team of AI researchers from around the world, and NVIDIA says that made the model’s “multi-accent and multilingual capabilities stronger.”
“We wanted to create a model that understands and generates sound like humans do,” said Rafael Valle, one of the researchers behind the project and a manager of applied audio research at NVIDIA. The company listed some possible real-world scenarios wherein Fugatto could be of use in its announcement. Music producers, it suggested, could use the technology to quickly generate a prototype for a song idea, which they can then easily edit to try out different styles, voices and instruments.
People could use it to generate materials for language learning tools in the voice of their choice. And video game developers could use it to create variations of pre-recorded assets to fit changes in the game based on the players’ choices and actions. In addition, the researchers found that the model can, with some fine-tuning, accomplish tasks that were not part of its pre-training. It could combine instructions that it was trained on separately, such as generating speech that sounds angry with a specific accent or the sound of birds singing during a thunderstorm. The model can also generate sounds that change over time, like the pounding of a rainstorm as it moves across the land.
NVIDIA didn’t say if it will give the public access to Fugatto, but the model isn’t the first generative AI technology that can create sounds out of text prompts. Meta previously released an open source AI kit that can create sounds from text descriptions. Google has its own text-to-music AI called MusicLM that people can access through the company’s AI Test Kitchen website.