Whisper in Medusa's Ear: Multi-head Efficient Decoding for Transformer-based ASR Paper • 2409.15869 • Published Sep 24, 2024
WhisperNER: Unified Open Named Entity and Speech Recognition Paper • 2409.08107 • Published Sep 12, 2024
Equivariant Architectures for Learning in Deep Weight Spaces Paper • 2301.12780 • Published Jan 30, 2023
LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading Paper • 2306.03258 • Published Jun 5, 2023
Beyond Transcription: Mechanistic Interpretability in ASR Paper • 2508.15882 • Published 17 days ago