A Survey of Context Engineering for Large Language Models Paper • 2507.13334 • Published Jul 17 • 249
LongDocURL: a Comprehensive Multimodal Long Document Benchmark Integrating Understanding, Reasoning, and Locating Paper • 2412.18424 • Published Dec 24, 2024 • 1
From System 1 to System 2: A Survey of Reasoning Large Language Models Paper • 2502.17419 • Published Feb 24 • 3
SOLIDGEO: Measuring Multimodal Spatial Math Reasoning in Solid Geometry Paper • 2505.21177 • Published May 27 • 1
TL;DR: Too Long, Do Re-weighting for Efficient LLM Reasoning Compression Paper • 2506.02678 • Published Jun 3 • 5
SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning Paper • 2506.08989 • Published Jun 10 • 15