Self-Improving Pretraining: using post-trained models to pretrain better models Paper • 2601.21343 • Published 2 days ago • 6
Linear representations in language models can change dramatically over a conversation Paper • 2601.20834 • Published 3 days ago • 19
CooperBench: Why Coding Agents Cannot be Your Teammates Yet Paper • 2601.13295 • Published 12 days ago • 3
Towards Pixel-Level VLM Perception via Simple Points Prediction Paper • 2601.19228 • Published 4 days ago • 15
AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning Paper • 2601.18631 • Published 5 days ago • 47
Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation Paper • 2601.20614 • Published 3 days ago • 112
FP8-RL: A Practical and Stable Low-Precision Stack for LLM Reinforcement Learning Paper • 2601.18150 • Published 5 days ago • 5
Spark: Strategic Policy-Aware Exploration via Dynamic Branching for Long-Horizon Agentic Learning Paper • 2601.20209 • Published 3 days ago • 21
Innovator-VL: A Multimodal Large Language Model for Scientific Discovery Paper • 2601.19325 • Published 4 days ago • 72
SAGE: Steerable Agentic Data Generation for Deep Search with Execution Feedback Paper • 2601.18202 • Published 5 days ago • 8
EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience Paper • 2601.15876 • Published 9 days ago • 89
SWE-Pruner: Self-Adaptive Context Pruning for Coding Agents Paper • 2601.16746 • Published 8 days ago • 86
Endless Terminals: Scaling RL Environments for Terminal Agents Paper • 2601.16443 • Published 8 days ago • 16
DSGym: A Holistic Framework for Evaluating and Training Data Science Agents Paper • 2601.16344 • Published 9 days ago • 10