PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC • Paper • 2502.14282 • Published 3 days ago • 14
MLGym: A New Framework and Benchmark for Advancing AI Research Agents • Paper • 2502.14499 • Published 3 days ago • 147
ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models • Paper • 2502.09696 • Published 10 days ago • 38
The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks • Paper • 2502.08235 • Published 11 days ago • 53
ImageRAG: Dynamic Image Retrieval for Reference-Guided Image Generation • Paper • 2502.09411 • Published 10 days ago • 16
Mathematical Reasoning in Large Language Models: Assessing Logical and Arithmetic Errors across Wide Numerical Ranges • Paper • 2502.08680 • Published 11 days ago • 11
CoT-Valve: Length-Compressible Chain-of-Thought Tuning • Paper • 2502.09601 • Published 10 days ago • 12
mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data • Paper • 2502.08468 • Published 11 days ago • 13
SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models • Paper • 2502.09390 • Published 10 days ago • 16
Exploring the Potential of Encoder-free Architectures in 3D LMMs • Paper • 2502.09620 • Published 10 days ago • 26
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency • Paper • 2502.09621 • Published 10 days ago • 27
CoSER: Coordinating LLM-Based Persona Simulation of Established Roles • Paper • 2502.09082 • Published 10 days ago • 28
SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models • Paper • 2502.09604 • Published 10 days ago • 31
An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging • Paper • 2502.09056 • Published 10 days ago • 30