R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning Paper • 2508.21113 • Published 10 days ago • 104
Continuous Speculative Decoding for Autoregressive Image Generation Paper • 2411.11925 • Published Nov 18, 2024 • 16
Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought Paper • 2505.15431 • Published May 21, 2025 • 1
Re-ranking Reasoning Context with Tree Search Makes Large Vision-Language Models Stronger Paper • 2506.07785 • Published Jun 9, 2025 • 1
Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis Paper • 2409.06135 • Published Sep 10, 2024 • 16
AVESFormer: Efficient Transformer Design for Real-Time Audio-Visual Segmentation Paper • 2408.01708 • Published Aug 3, 2024 • 4
Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-Visual Segmentation Paper • 2312.06462 • Published Dec 11, 2023