Article Data exploration and filtering with Nomic Atlas By visheratin • Mar 22, 2024 • 5
Arabic (MSA) Summarization Models & Datasets Collection A collection of models trained to summarize Arabic text, along with the dataset used to train them. • 5 items • Updated 3 days ago • 1
Translation Models & Datasets Collection English-to-Moroccan Darija (ary) models • 15 items • Updated 3 days ago • 1
Moroccan Darija Datasets Collection A collection of all available datasets for pretraining LLMs • 12 items • Updated 3 days ago • 1
Moroccan Darija Embeddings Models & Datasets Collection Sentence and word embedding models for Moroccan Darija (ary) • 7 items • Updated 3 days ago • 1
Moroccan Darija LLMs Collection Language models that speak Moroccan Darija (ary) • 9 items • Updated 3 days ago • 1
Article Train 400x faster Static Embedding Models with Sentence Transformers Jan 15 • 148
Article Darija Chatbot Arena: Making LLMs Compete in the Moroccan Dialect By atlasia and 2 others • 13 days ago • 10
LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection Paper • 2408.04284 • Published Aug 8, 2024 • 26
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 19 days ago • 190
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models Paper • 2306.07691 • Published Jun 13, 2023 • 7
Article Finding Moroccan Arabic (Darija) in Fineweb 2 By omarkamali and 3 others • Dec 8, 2024 • 22
Article TerjamaBench: A Cultural Benchmark for English-Darija Machine Translation By imomayiz and 4 others • Jan 10 • 28