Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular +4 7 days ago • 72
Article Apriel-1.6-15b-Thinker: Cost-efficient Frontier Multimodal Performance 15 days ago • 81
Nemotron-Cascade Collection Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models • 17 items • Updated 1 day ago • 36
Article Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models 10 days ago • 97
Sample More to Think Less: Group Filtered Policy Optimization for Concise Reasoning Paper • 2508.09726 • Published Aug 13 • 15
RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments Paper • 2511.07317 • Published Nov 10 • 14
Article Transformers v5: Simple model definitions powering the AI ecosystem +2 24 days ago • 253
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 138
Article Ultra-Long Sequence Parallelism: Ulysses + Ring-Attention Technical Principles and Implementation Sep 16 • 17
An efficient probabilistic hardware architecture for diffusion-like models Paper • 2510.23972 • Published Oct 28 • 4
Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning Paper • 2510.25992 • Published Oct 29 • 45