S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning Paper • 2502.12853 • Published 5 days ago • 22
LLM-based User Profile Management for Recommender System Paper • 2502.14541 • Published 3 days ago • 4
Multimodal RewardBench: Holistic Evaluation of Reward Models for Vision Language Models Paper • 2502.14191 • Published 3 days ago • 4
CLIPPER: Compression enables long-context synthetic data generation Paper • 2502.14854 • Published 3 days ago • 6
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning Paper • 2502.14768 • Published 3 days ago • 31
LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models Paper • 2502.14834 • Published 3 days ago • 22
PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC Paper • 2502.14282 • Published 3 days ago • 14
MLGym: A New Framework and Benchmark for Advancing AI Research Agents Paper • 2502.14499 • Published 3 days ago • 145
Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation Paper • 2502.14846 • Published 3 days ago • 13
AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO Paper • 2502.14669 • Published 3 days ago • 7
RelaCtrl: Relevance-Guided Efficient Control for Diffusion Transformers Paper • 2502.14377 • Published 3 days ago • 10
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published 3 days ago • 87
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published 3 days ago • 98
From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions Paper • 2502.13791 • Published 4 days ago • 5
Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models Paper • 2502.13533 • Published 4 days ago • 6