Droplet3D: Commonsense Priors from Videos Facilitate 3D Generation Paper • 2508.20470 • Published 10 days ago • 64
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published Mar 12 • 74
CoSTAast: Cost-Sensitive Toolpath Agent for Multi-turn Image Editing Paper • 2503.10613 • Published Mar 13 • 80
DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation Paper • 2503.06053 • Published Mar 8 • 138
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders Paper • 2408.15998 • Published Aug 28, 2024 • 88
DrawingSpinUp: 3D Animation from Single Character Drawings Paper • 2409.08615 • Published Sep 13, 2024 • 21
Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models Paper • 2409.07452 • Published Sep 11, 2024 • 22
CoRe: Context-Regularized Text Embedding Learning for Text-to-Image Personalization Paper • 2408.15914 • Published Aug 28, 2024 • 25
INTRA: Interaction Relationship-aware Weakly Supervised Affordance Grounding Paper • 2409.06210 • Published Sep 10, 2024 • 27
B4: Towards Optimal Assessment of Plausible Code Solutions with Plausible Tests Paper • 2409.08692 • Published Sep 13, 2024 • 28
A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor? Paper • 2409.15277 • Published Sep 23, 2024 • 39
PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction Paper • 2410.17247 • Published Oct 22, 2024 • 48
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation Paper • 2411.04709 • Published Nov 5, 2024 • 27
Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models Paper • 2411.07126 • Published Nov 11, 2024 • 31