Submitted by akhaliq 20 A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions · 6 authors 1
Submitted by akhaliq 15 SEEAvatar: Photorealistic Text-to-3D Avatar Generation with Constrained Geometry and Appearance · 3 authors 1
Submitted by akhaliq 15 Zebra: Extending Context Window with Layerwise Grouped Local-Global Attention · 5 authors 1
Submitted by akhaliq 13 FineControlNet: Fine-level Text Control for Image Generation with Spatially Aligned Text Control Injection · 6 authors 2
Submitted by akhaliq 12 LIME: Localized Image Editing via Attention Regularization in Diffusion Models · 5 authors 4
Submitted by akhaliq 12 ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks · 11 authors 2
Submitted by akhaliq 11 UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation · 10 authors 1
Submitted by akhaliq 11 Helping or Herding? Reward Model Ensembles Mitigate but do not Eliminate Reward Hacking · 12 authors 1
Submitted by akhaliq 10 VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation · 8 authors 1