JamendoLyrics

AI & ML interests

Lyrics transcription and alignment

This is the home of the JamendoLyrics and Jam-ALT datasets for lyrics alignment and transcription research.

JamendoLyrics MultiLang

JamendoLyrics MultiLang is a benchmark for automatic lyrics alignment (ALA). It contains 79 songs spanning a range of genres and four languages (🇬🇧🇪🇸🇩🇪🇫🇷), each with lyrics that are time-aligned to the audio at both the word level and the line level.

🤗 dataset: jamendolyrics/jamendolyrics
📄 paper: Similarity-based Audio-Lyrics Alignment of Multiple Languages (ICASSP 2023)
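
To make "time-aligned at the word and line level" concrete, here is a minimal sketch of how such annotations could be represented and how predicted word onsets might be scored against them. The `Word` class, the example timings, and the `group_lines` and `mean_abs_onset_error` helpers are hypothetical illustrations; they are not the dataset's actual schema or the evaluation code from the paper.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One sung word with its timing (hypothetical layout, not the dataset schema)."""
    text: str
    start: float            # onset in seconds
    end: float              # offset in seconds
    line_end: bool = False  # True if this word closes a lyrics line


def group_lines(words):
    """Derive line-level alignments (text, start, end) from word-level ones."""
    lines, current = [], []
    for word in words:
        current.append(word)
        if word.line_end:
            lines.append(current)
            current = []
    if current:  # trailing words without an explicit line end
        lines.append(current)
    return [
        (" ".join(w.text for w in line), line[0].start, line[-1].end)
        for line in lines
    ]


def mean_abs_onset_error(reference, predicted_starts):
    """Average absolute difference between reference and predicted word onsets."""
    if not reference or len(reference) != len(predicted_starts):
        raise ValueError("need one predicted onset per reference word")
    total = sum(abs(w.start - p) for w, p in zip(reference, predicted_starts))
    return total / len(reference)


# Made-up example: two lines, three words.
words = [
    Word("Hello", 0.50, 0.90),
    Word("world", 1.00, 1.40, line_end=True),
    Word("again", 2.00, 2.50, line_end=True),
]
lines = group_lines(words)
error = mean_abs_onset_error(words, [0.60, 1.00, 1.80])
```

Keeping line boundaries as a flag on the word records means the line-level alignment can always be derived from the word-level one, so the two cannot drift apart.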

Jam-ALT

Jam-ALT is an automatic lyrics transcription (ALT) benchmark. Derived from JamendoLyrics, it has been thoroughly revised according to industry standards for lyrics transcription and formatting, and includes proper punctuation and capitalization (PnC). However, it currently does not include timing information.

The benchmark is also accompanied by the alt-eval Python package for computing transcription evaluation metrics.

🤗 dataset: jamendolyrics/jam-alt
📄 paper: Lyrics Transcription for Humans: A Readability-Aware Benchmark (ISMIR 2024)
🧑‍💻 code: audioshake/alt-eval
🌐 website: audioshake.github.io/jam-alt
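
As an illustration of why punctuation and capitalization matter when scoring transcriptions, the sketch below computes a plain word error rate (WER) in two ways: once on the raw text and once after lowercasing and stripping punctuation. This is a simplified, self-contained example and not the alt-eval implementation; the `tokenize` and `word_error_rate` functions and the sample strings are assumptions made for illustration only.

```python
import string


def tokenize(text, keep_pnc=True):
    """Split on whitespace; optionally drop case and ASCII punctuation first."""
    if not keep_pnc:
        text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return text.split()


def word_error_rate(reference, hypothesis, keep_pnc=True):
    """Word-level edit distance divided by the number of reference words."""
    ref = tokenize(reference, keep_pnc)
    hyp = tokenize(hypothesis, keep_pnc)
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Standard dynamic-programming Levenshtein distance, one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


reference = "Hello, world! It's me."
hypothesis = "hello world its me"

wer_with_pnc = word_error_rate(reference, hypothesis, keep_pnc=True)
wer_normalized = word_error_rate(reference, hypothesis, keep_pnc=False)
```

In this made-up example every word counts as an error when punctuation and case are kept, while the normalized comparison sees no errors at all. A readability-aware benchmark needs metrics that can tell these two situations apart.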