RL - a dearaj23 Collection

RL

updated Oct 20

Agentic Reinforced Policy Optimization

Paper • 2507.19849 • Published Jul 26 • 158

In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

Paper • 2510.05592 • Published Oct 7 • 106