Code2World: A GUI World Model via Renderable Code Generation • Paper • 2602.09856 • Published 1 day ago • 167
Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models • Paper • 2602.07026 • Published 9 days ago • 128
MOVA: Towards Scalable and Synchronized Video-Audio Generation • Paper • 2602.08794 • Published 2 days ago • 142
UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing • Paper • 2602.02437 • Published 9 days ago • 75
AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration • Paper • 2602.03786 • Published 8 days ago • 84
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text • Paper • 2601.22975 • Published 12 days ago • 95
Closing the Loop: Universal Repository Representation with RPG-Encoder • Paper • 2602.02084 • Published 9 days ago • 82
CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding • Paper • 2602.01785 • Published 10 days ago • 91
PaperBanana: Automating Academic Illustration for AI Scientists • Paper • 2601.23265 • Published 12 days ago • 170
Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models • Paper • 2602.02185 • Published 9 days ago • 125
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models • Paper • 2601.22060 • Published 13 days ago • 150
Green-VLA: Staged Vision-Language-Action Model for Generalist Robots • Paper • 2602.00919 • Published 11 days ago • 268
Scaling Embeddings Outperforms Scaling Experts in Language Models • Paper • 2601.21204 • Published 14 days ago • 98
daVinci-Dev: Agent-native Mid-training for Software Engineering • Paper • 2601.18418 • Published 16 days ago • 124
Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives • Paper • 2601.20833 • Published 14 days ago • 175