EgoX: Egocentric Video Generation from a Single Exocentric Video • Paper • 2512.08269 • Published 20 days ago • 113
Let Multimodal Embedders Learn When to Augment Query via Adaptive Query Augmentation • Paper • 2511.02358 • Published Nov 4 • 4
MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation • Paper • 2511.09611 • Published Nov 12 • 68
Decomposed Attention Fusion in MLLMs for Training-Free Video Reasoning Segmentation • Paper • 2510.19592 • Published Oct 22 • 12
Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers • Paper • 2407.09941 • Published Jul 13, 2024 • 1
FIFO-Diffusion: Generating Infinite Videos from Text without Training • Paper • 2405.11473 • Published May 19, 2024 • 56