VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use Paper • 2509.01055 • Published 7 days ago • 61
Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation Paper • 2506.09350 • Published Jun 11 • 48
The Automated but Risky Game: Modeling Agent-to-Agent Negotiations and Transactions in Consumer Markets Paper • 2506.00073 • Published May 29 • 2
DINO-R1: Incentivizing Reasoning Capability in Vision Foundation Models Paper • 2505.24025 • Published May 29 • 27
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 418
Fraud-R1: A Multi-Round Benchmark for Assessing the Robustness of LLM Against Augmented Fraud and Phishing Inducements Paper • 2502.12904 • Published Feb 18 • 2
What makes your model a low-empathy or warmth person: Exploring the Origins of Personality in LLMs Paper • 2410.10863 • Published Oct 7, 2024 • 1
Parameter-Efficient Fine-Tuning of Large Language Models for Unit Test Generation: An Empirical Study Paper • 2411.02462 • Published Nov 4, 2024 • 10
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations Paper • 2410.02707 • Published Oct 3, 2024 • 49