Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL Paper • 2508.13167 • Published 15 days ago • 80
FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction Paper • 2508.11987 • Published 5 days ago • 38
Agent Lightning: Train ANY AI Agents with Reinforcement Learning Paper • 2508.03680 • Published 16 days ago • 61
Efficient Agents: Building Effective Agents While Reducing Cost Paper • 2508.02694 • Published 28 days ago • 81
Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference Paper • 2508.02193 • Published 17 days ago • 126
SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories? Paper • 2507.12415 • Published Jul 16 • 41
Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization Paper • 2505.23387 • Published May 29 • 9
view article Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • Jul 8 • 631
Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving Paper • 2507.06229 • Published Jul 8 • 73
CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization Paper • 2507.06181 • Published Jul 8 • 41
ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention Paper • 2507.01004 • Published Jul 1 • 10
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning Paper • 2506.24119 • Published Jun 30 • 48
OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling Paper • 2506.20512 • Published Jun 25 • 46
MiniMax-M1 Collection MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. • 6 items • Updated Jul 3 • 110
AceReason Collection Math and Code reasoning model trained through reinforcement learning (RL) • 7 items • Updated 7 days ago • 14