ReCode: Unify Plan and Action for Universal Granularity Control Paper • 2510.23564 • Published Oct 27 • 121
InteractComp: Evaluating Search Agents With Ambiguous Queries Paper • 2510.24668 • Published Oct 28 • 97
VisJudge-Bench: Aesthetics and Quality Assessment of Visualizations Paper • 2510.22373 • Published Oct 25 • 14
Reasoning via Video: The First Evaluation of Video Models' Reasoning Abilities through Maze-Solving Tasks Paper • 2511.15065 • Published Nov 19 • 74
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published Mar 31 • 301