The Landscape of Agentic Reinforcement Learning for LLMs: A Survey Paper • 2509.02547 • Published 4 days ago • 144
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning Paper • 2509.02479 • Published 4 days ago • 76
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning Paper • 2509.02544 • Published 4 days ago • 101
R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning Paper • 2508.21113 • Published 9 days ago • 103
AWorld: Orchestrating the Training Recipe for Agentic AI Paper • 2508.20404 • Published 10 days ago • 37
USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning Paper • 2508.18966 • Published 11 days ago • 56
Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning Paper • 2508.20751 • Published 9 days ago • 85
Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies Paper • 2508.20072 • Published 10 days ago • 28
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published 12 days ago • 179
AetherCode: Evaluating LLMs' Ability to Win In Premier Programming Competitions Paper • 2508.16402 • Published 15 days ago • 14
ODYSSEY: Open-World Quadrupeds Exploration and Manipulation for Long-Horizon Tasks Paper • 2508.08240 • Published 26 days ago • 43
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs Paper • 2508.16153 • Published 15 days ago • 131
Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation Paper • 2508.13998 • Published 18 days ago • 16
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL Paper • 2508.13167 • Published about 1 month ago • 123
ComoRAG: A Cognitive-Inspired Memory-Organized RAG for Stateful Long Narrative Reasoning Paper • 2508.10419 • Published 23 days ago • 71
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models Paper • 2508.09834 • Published 24 days ago • 52