Both Semantics and Reconstruction Matter: Making Representation Encoders Ready for Text-to-Image Generation and Editing • Paper • 2512.17909 • Published 8 days ago • 36
MathCanvas: Intrinsic Visual Chain-of-Thought for Multimodal Mathematical Reasoning • Paper • 2510.14958 • Published Oct 16 • 22
DriveVLA-W0: World Models Amplify Data Scaling Law in Autonomous Driving • Paper • 2510.12796 • Published Oct 14 • 12
Generic Token Compression in Multimodal Large Language Models from an Explainability Perspective • Paper • 2506.01097 • Published Jun 1 • 3
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code • Paper • 2410.08196 • Published Oct 10, 2024 • 47