Bugai's Collection - a BugaiL Collection

updated about 8 hours ago

Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

Paper • 2508.20751 • Published 9 days ago • 85
TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling

Paper • 2508.17445 • Published 12 days ago • 77
VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space

Paper • 2508.19247 • Published 10 days ago • 39
VibeVoice Technical Report

Paper • 2508.19205 • Published 10 days ago • 120
USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning

Paper • 2508.18966 • Published 11 days ago • 55
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published 3 days ago • 139
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published 3 days ago • 76
LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model

Paper • 2509.00676 • Published 6 days ago • 73
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

Paper • 2509.01055 • Published 5 days ago • 59
Gated Associative Memory: A Parallel O(N) Architecture for Efficient Sequence Modeling

Paper • 2509.00605 • Published 6 days ago • 32
Open Data Synthesis For Deep Research

Paper • 2509.00375 • Published 7 days ago • 49
DeepResearch Arena: The First Exam of LLMs' Research Abilities via Seminar-Grounded Tasks

Paper • 2509.01396 • Published 5 days ago • 41