Video Reality Test: Can AI-Generated ASMR Videos fool VLMs and Humans? Paper • 2512.13281 • Published 12 days ago • 63
OpenSubject: Leveraging Video-Derived Identity and Diversity Priors for Subject-driven Image Generation and Manipulation Paper • 2512.08294 • Published 18 days ago • 17
OneThinker: All-in-one Reasoning Model for Image and Video Paper • 2512.03043 • Published 25 days ago • 32
Architecture Decoupling Is Not All You Need For Unified Multimodal Model Paper • 2511.22663 • Published 30 days ago • 29
Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views Paper • 2510.18632 • Published Oct 21 • 21
MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer Paper • 2509.16197 • Published Sep 19 • 56