Article LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family 12 days ago • 75
UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision Paper • 2601.03193 • Published 25 days ago • 46
Article Open ASR Leaderboard: Trends and Insights with New Multilingual & Long-Form Tracks Nov 21, 2025 • 25
AraLingBench A Human-Annotated Benchmark for Evaluating Arabic Linguistic Capabilities of Large Language Models Paper • 2511.14295 • Published Nov 18, 2025 • 72
Baseer: A Vision-Language Model for Arabic Document-to-Markdown OCR Paper • 2509.18174 • Published Sep 17, 2025 • 129
Qari-OCR: A High-Accuracy Model for Arabic Optical Character Collection Built on the powerful Qwen2 VL 2B and fine-tuned on an Arabic OCR dataset, Qari v0.1 de • 7 items • Updated Jun 25, 2025 • 12
Pearl Collection PEARL: A Multimodal Culturally-Aware Arabic Instruction Dataset • 4 items • Updated Oct 27, 2025 • 6
QARI-OCR: High-Fidelity Arabic Text Recognition through Multimodal Large Language Model Adaptation Paper • 2506.02295 • Published Jun 2, 2025 • 10
SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment Paper • 2507.20984 • Published Jul 28, 2025 • 58
NeuralOS: Towards Simulating Operating Systems via Neural Generative Models Paper • 2507.08800 • Published Jul 11, 2025 • 81
Distilling LLM Agent into Small Models with Retrieval and Code Tools Paper • 2505.17612 • Published May 23, 2025 • 81