Voice Conversion for Likability Control via Automated Rating of Speech Synthesis Corpora Paper • 2507.01356 • Published Jul 2
SpeechAccentLLM: A Unified Framework for Foreign Accent Conversion and Text to Speech Paper • 2507.01348 • Published Jul 2
Multi-interaction TTS toward professional recording reproduction Paper • 2507.00808 • Published Jul 1
Teaching Audio-Aware Large Language Models What Does Not Hear: Mitigating Hallucinations through Synthesized Negative Samples Paper • 2505.14518 • Published May 20
Adaptability of ASR Models on Low-Resource Language: A Comparative Study of Whisper and Wav2Vec-BERT on Bangla Paper • 2507.01931 • Published Jul 2
DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment Paper • 2507.02768 • Published Jul 3