SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training Paper • 2412.15649 • Published Dec 20, 2024
SLAM-AAC: Enhancing Audio Captioning with Paraphrasing Augmentation and CLAP-Refine through LLMs Paper • 2410.09503 • Published Oct 12, 2024
EAT: Self-Supervised Pre-Training with Efficient Audio Transformer Paper • 2401.03497 • Published Jan 7, 2024