Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks Paper • 2510.08002 • Published Oct 9, 2025
Surrogate Signals from Format and Length: Reinforcement Learning for Solving Mathematical Problems without Ground Truth Answers Paper • 2505.19439 • Published May 26, 2025
MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning Paper • 2503.07365 • Published Mar 10, 2025