Submitted by di-zhang-fdu 34 Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning · 13 authors 2
Submitted by rizavelioglu 26 TryOffDiff: Virtual-Try-Off via High-Fidelity Garment Reconstruction using Diffusion Models · 4 authors 7
Submitted by ChengyouJia 23 ChatGen: Automatic Text-to-Image Generation From FreeStyle Chatting · 6 authors 3
Submitted by Gynjn 15 SelfSplat: Pose-Free and 3D Prior-Free Generalizable 3D Gaussian Splatting · 8 authors 2
Submitted by kjm981995 12 Free$^2$Guide: Gradient-Free Path Integral Control for Enhancing Text-to-Video Generation with Large Vision-Language Models · 3 authors 2
Submitted by Owos 4 AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset · 26 authors 3