ACE-Step: A Leap Forward in Music Generation Foundation Models
2025-05-06
ACE-Step is a novel open-source foundation model for music generation that integrates diffusion-based generation with a Deep Compression AutoEncoder and a lightweight linear transformer. This approach overcomes the trade-offs between speed, coherence, and control found in existing LLM and diffusion models. ACE-Step generates up to 4 minutes of music in 20 seconds on an A100 GPU—15x faster than LLM baselines—while maintaining superior musical coherence and lyric alignment. It supports diverse styles, genres, and 19 languages, and offers advanced controls like voice cloning and lyric editing. The project aims to be the 'Stable Diffusion' of music AI, providing a flexible foundation for future music creation tools.
AI