Article NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks By nvidia and 4 others • 10 days ago • 62
Article Accelerate ND-Parallel: A Guide to Efficient Multi-GPU Training By siro1 and 4 others • 13 days ago • 50
Article Vision Language Model Alignment in TRL ⚡️ By sergiopaniego and 4 others • 14 days ago • 69
Article Introducing Command A Vision: Multimodal AI built for Business By CohereLabs and 3 others • 21 days ago • 63
EDGE-GRPO: Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity Paper • 2507.21848 • Published 23 days ago • 7
GLiNER2: An Efficient Multi-Task Information Extraction System with Schema-Driven Interface Paper • 2507.18546 • Published 28 days ago • 18
ULD Loss (Universal LLMs Distillation) Collection The ULD loss, based on optimal transport, enables distillation across different LLM families without requiring shared tokenizers. • 14 items • Updated Jul 15 • 2
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization Paper • 2411.10442 • Published Nov 15, 2024 • 87
ThinkPRM Collection Process Reward Models that Think -- https://arxiv.org/abs/2504.16828 • 8 items • Updated 23 days ago • 3
Article Three Mighty Alerts Supporting Hugging Face’s Production Infrastructure By jcudit • Jul 8 • 10
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders By thomwolf and 1 other • Jul 9 • 649
Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • Jul 8 • 631
Article Bringing Fusion Down to Earth: ML for Stellarator Optimization By cgeorgiaw • Jul 2 • 73
Article Enhance Your Models in 5 Minutes with the Hugging Face Kernel Hub By drbh and 6 others • Jun 12 • 125
Article Gemma 3n fully available in the open-source ecosystem! By ariG23498 and 7 others • Jun 26 • 115
Article xLSTM-based time series model TiRex significantly outperforms competing models in forecasting accuracy By BobWue • Jun 4 • 12