Ming-Flash-Omni: A Sparse, Unified Architecture for Multimodal Perception and Generation Paper • 2510.24821 • Published Oct 28, 2025 • 38
Video Virtual Try-on with Conditional Diffusion Transformer Inpainter Paper • 2506.21270 • Published Jun 26, 2025
UniAlignment: Semantic Alignment for Unified Image Generation, Understanding, Manipulation and Perception Paper • 2509.23760 • Published Sep 28, 2025 • 1
ARGenSeg: Image Segmentation with Autoregressive Image Generation Model Paper • 2510.20803 • Published Oct 23, 2025 • 9
LumiSculpt: A Consistency Lighting Control Network for Video Generation Paper • 2410.22979 • Published Oct 30, 2024 • 2
Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer Paper • 2510.06590 • Published Oct 8, 2025 • 73 • 3
Mimir: Improving Video Diffusion Models for Precise Text Understanding Paper • 2412.03085 • Published Dec 4, 2024 • 12
Animate-X: Universal Character Image Animation with Enhanced Motion Representation Paper • 2410.10306 • Published Oct 14, 2024 • 56
Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction Paper • 2505.02471 • Published May 5, 2025 • 15