Analysing Chain of Thought Dynamics: Active Guidance or Unfaithful Post-hoc Rationalisation? Paper • 2508.19827 • Published 12 days ago • 31
SciReplicate-Bench: Benchmarking LLMs in Agent-driven Algorithmic Reproduction from Research Papers Paper • 2504.00255 • Published Mar 31 • 1
Spectrum Projection Score: Aligning Retrieved Summaries with Reader Models in Retrieval-Augmented Generation Paper • 2508.05909 • Published Aug 8 • 20
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems Paper • 2508.07407 • Published 29 days ago • 90