Jigsaw-R1: A Study of Rule-based Visual Reinforcement Learning with Jigsaw Puzzles Paper • 2505.23590 • Published May 29 • 25
How Much Backtracking is Enough? Exploring the Interplay of SFT and RL in Enhancing LLM Reasoning Paper • 2505.24273 • Published May 30 • 4
RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers Paper • 2506.02528 • Published Jun 3 • 15
Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers Paper • 2506.03065 • Published Jun 3 • 27
Vision-Language-Vision Auto-Encoder: Scalable Knowledge Distillation from Diffusion Models Paper • 2507.07104 • Published Jul 9 • 45