Canary Collection A collection of multilingual and multitask speech to text models from NVIDIA NeMo 🐤 • 5 items • Updated 6 days ago • 25
ParallelSearch Collection Checkpoints for our paper "ParallelSearch: Train your LLMs to Decompose Query and Search Sub-queries in Parallel with Reinforcement Learning" • 3 items • Updated 13 days ago • 5
DINOv3 Collection DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated 6 days ago • 195
view article Article Vision Language Model Alignment in TRL ⚡️ By sergiopaniego and 4 others • 14 days ago • 69
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent Paper • 2508.05748 • Published 13 days ago • 114
👁️ LFM2-VL Collection LFM2-VL is our first series of vision-language models, designed for on-device deployment. • 6 items • Updated 2 days ago • 31
Éclair -- Extracting Content and Layout with Integrated Reading Order for Documents Paper • 2502.04223 • Published Feb 6 • 11
view article Article NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks By nvidia and 4 others • 10 days ago • 60
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems Paper • 2508.07407 • Published 11 days ago • 82
view article Article Accelerate ND-Parallel: A Guide to Efficient Multi-GPU Training By siro1 and 4 others • 13 days ago • 50
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification Paper • 2508.05629 • Published 13 days ago • 151
Vision Language Leaderboards Collection This collection has all the vision language leaderboards. • 7 items • Updated Aug 24, 2024 • 21