Submitted by akhaliq 70 StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation · 10 authors 5
Submitted by akhaliq 46 VideoPoet: A Large Language Model for Zero-Shot Video Generation · 31 authors 2
Submitted by akhaliq 42 PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU · 4 authors 4
Submitted by akhaliq 29 DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation · 10 authors 2
Submitted by akhaliq 28 DreamTuner: Single Image is Enough for Subject-Driven Generation · 6 authors 6
Submitted by akhaliq 28 Zero-Shot Metric Depth with a Field-of-View Conditioned Diffusion Model · 5 authors 4
Submitted by akhaliq 27 Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis · 9 authors 2
Submitted by akhaliq 18 InstructVideo: Instructing Video Diffusion Models with Human Feedback · 10 authors 1
Submitted by akhaliq 15 TinySAM: Pushing the Envelope for Efficient Segment Anything Model · 8 authors 1
Submitted by akhaliq 14 Cached Transformers: Improving Transformers with Differentiable Memory Cache · 6 authors 1
Submitted by akhaliq 12 Neural feels with neural fields: Visuo-tactile perception for in-hand manipulation · 12 authors 1
Submitted by akhaliq 11 Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models · 5 authors 1
Submitted by akhaliq 11 MaskINT: Video Editing via Interpolative Non-autoregressive Masked Transformers · 8 authors
Submitted by akhaliq 10 Adaptive Guidance: Training-free Acceleration of Conditional Diffusion Models · 8 authors
Submitted by akhaliq 10 Mini-GPTs: Efficient Large Language Models through Contextual Pruning · 3 authors
Submitted by akhaliq 7 UniSDF: Unifying Neural Representations for High-Fidelity 3D Reconstruction of Complex Scenes with Reflections · 6 authors
Submitted by akhaliq 6 Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting · 9 authors
Submitted by akhaliq 5 RadEdit: stress-testing biomedical vision models via diffusion image editing · 14 authors