
Oumayma Essarhi

oumayma03

AI & ML interests

None yet

Recent Activity

Organizations

Arabic Machine Learning, Mixed Arabic Datasets, Hugging Face Discord Community

oumayma03's activity

upvoted an article 13 days ago
Article

Darija Chatbot Arena: Making LLMs Compete in the Moroccan Dialect

By atlasia and 2 others • 10
reacted to yuexiang96's post with 🚀 4 months ago
Post
3073
๐ŸŒ Iโ€™ve always had a dream of making AI accessible to everyone, regardless of location or language. However, current open MLLMs often respond in English, even to non-English queries!

🚀 Introducing Pangea: A Fully Open Multilingual Multimodal LLM supporting 39 languages! 🌍✨

https://neulab.github.io/Pangea/
https://arxiv.org/pdf/2410.16153

The Pangea family includes three major components:
🔥 Pangea-7B: A state-of-the-art multilingual multimodal LLM capable of 39 languages! Not only does it excel in multilingual scenarios, but it also matches or surpasses English-centric models like Llama 3.2, Molmo, and LlavaOneVision in English performance.

๐Ÿ“ PangeaIns: A 6M multilingual multimodal instruction tuning dataset across 39 languages. ๐Ÿ—‚๏ธ With 40% English instructions and 60% multilingual instructions, it spans various domains, including 1M culturally-relevant images sourced from LAION-Multi. ๐ŸŽจ

๐Ÿ† PangeaBench: A comprehensive evaluation benchmark featuring 14 datasets in 47 languages. Evaluation can be tricky, so we carefully curated existing benchmarks and introduced two new datasets: xChatBench (human-annotated wild queries with fine-grained evaluation criteria) and xMMMU (a meticulously machine-translated version of MMMU).

Check out more details: https://x.com/xiangyue96/status/1848753709787795679
liked a Space over 1 year ago