In-Video Instructions: Visual Signals as Generative Control Paper • 2511.19401 • Published Nov 24 • 30
UniVA: Universal Video Agent towards Open-Source Next-Generation Video Generalist Paper • 2511.08521 • Published Nov 11 • 37
Discrete Diffusion LLM & MLLM Collection A collection of research/models in discrete diffusion large language and multimodal models • 57 items • Updated Jun 17 • 4
Discrete Diffusion in Large Language and Multimodal Models: A Survey Paper • 2506.13759 • Published Jun 16 • 43
Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding Paper • 2505.16990 • Published May 22 • 22
EarthMind: Towards Multi-Granular and Multi-Sensor Earth Observation with Large Multimodal Models Paper • 2506.01667 • Published Jun 2 • 21