Towards Scalable Pre-training of Visual Tokenizers for Generation Paper • 2512.13687 • Published 29 days ago • 100
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss Paper • 2512.23447 • Published 15 days ago • 93
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation Paper • 2512.23576 • Published 15 days ago • 64
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models Paper • 2512.24618 • Published 14 days ago • 131
PhyGDPO: Physics-Aware Groupwise Direct Preference Optimization for Physically Consistent Text-to-Video Generation Paper • 2512.24551 • Published 14 days ago • 18
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem Paper • 2512.24873 • Published 13 days ago • 98
Improving Multi-step RAG with Hypergraph-based Memory for Long-Context Complex Relational Modeling Paper • 2512.23959 • Published 15 days ago • 97
Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting Paper • 2601.02151 • Published 8 days ago • 92
Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization Paper • 2601.05432 • Published 5 days ago • 145
MMFormalizer: Multimodal Autoformalization in the Wild Paper • 2601.03017 • Published 7 days ago • 97