Next-Embedding Prediction Makes Strong Vision Learners Paper • 2512.16922 • Published 8 days ago • 78
On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models Paper • 2512.07783 • Published 18 days ago • 35
DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle Paper • 2512.04324 • Published 22 days ago • 149
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published 24 days ago • 232
LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling Paper • 2511.20785 • Published about 1 month ago • 154
Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights Paper • 2512.01816 • Published 25 days ago • 88
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published 25 days ago • 93
Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance Paper • 2511.13254 • Published Nov 17 • 136
P1: Mastering Physics Olympiads with Reinforcement Learning Paper • 2511.13612 • Published Nov 17 • 133
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling Paper • 2511.11793 • Published Nov 14 • 164
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds Paper • 2511.08892 • Published Nov 12 • 201
DRIVE: Data Curation Best Practices for Reinforcement Learning with Verifiable Reward in Competitive Code Generation Paper • 2511.06307 • Published Nov 9 • 51
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B Paper • 2511.06221 • Published Nov 9 • 131
The Path Not Taken: RLVR Provably Learns Off the Principals Paper • 2511.08567 • Published Nov 11 • 33
IterResearch: Rethinking Long-Horizon Agents via Markovian State Reconstruction Paper • 2511.07327 • Published Nov 10 • 76
Too Good to be Bad: On the Failure of LLMs to Role-Play Villains Paper • 2511.04962 • Published Nov 7 • 53