PicoAudio2: Temporal Controllable Text-to-Audio Generation with Natural Language Description • Paper • 2509.00683 • Published Aug 31
UniFlow-Audio: Unified Flow Matching for Audio Generation from Omni-Modalities • Paper • 2509.24391 • Published Sep 29
SPEAR: A Unified SSL Framework for Learning Speech and Audio Representations • Paper • 2510.25955 • Published Oct 29