Submitted by akhaliq 20 SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing · 5 authors 3
Submitted by akhaliq 20 G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model · 11 authors 2
Submitted by akhaliq 19 GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning · 7 authors 1
Submitted by akhaliq 19 M3DBench: Let's Instruct Large Models with Multi-modal 3D Prompts · 8 authors 1
Submitted by akhaliq 15 MagicScroll: Nontypical Aspect-Ratio Image Generation for Visual Storytelling via Multi-Layered Semantic-Aware Denoising · 7 authors 1
Submitted by akhaliq 14 A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise · 22 authors 3
Submitted by akhaliq 11 MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance · 5 authors 1
Submitted by akhaliq 11 Silkie: Preference Distillation for Large Visual Language Models · 9 authors 1
Submitted by akhaliq 8 Catwalk: A Unified Language Model Evaluation Framework for Many Datasets · 10 authors 1
Submitted by akhaliq 7 Your Student is Better Than Expected: Adaptive Teacher-Student Collaboration for Text-Conditional Diffusion Models · 4 authors 1
Submitted by akhaliq 6 Towards Accurate Guided Diffusion Sampling through Symplectic Adjoint Method · 5 authors 2
Submitted by akhaliq 6 VolumeDiffusion: Flexible Text-to-3D Generation with Efficient Volumetric Encoder · 7 authors 1
Submitted by akhaliq 5 GauFRe: Gaussian Deformation Fields for Real-time Dynamic Novel View Synthesis · 7 authors 1