The Landscape of Agentic Reinforcement Learning for LLMs: A Survey Paper • 2509.02547 • Published 7 days ago • 172
From Scores to Skills: A Cognitive Diagnosis Framework for Evaluating Financial Large Language Models Paper • 2508.13491 • Published 21 days ago • 58
DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization Paper • 2508.14460 • Published 20 days ago • 80
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL Paper • 2508.13167 • Published Aug 6 • 125
We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning Paper • 2508.10433 • Published 26 days ago • 143
view article Article NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks By nvidia and 4 others • 29 days ago • 72
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published Aug 8 • 176
Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference Paper • 2508.02193 • Published Aug 4 • 129
Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving Paper • 2507.23726 • Published Jul 31 • 112
Phi-Ground Tech Report: Advancing Perception in GUI Grounding Paper • 2507.23779 • Published Jul 31 • 44
Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs Paper • 2507.09477 • Published Jul 13 • 80
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation Paper • 2507.10524 • Published Jul 14 • 69
Should We Still Pretrain Encoders with Masked Language Modeling? Paper • 2507.00994 • Published Jul 1 • 78