Submitted by Ayushk44 78 Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models · 5 authors 3
Submitted by Fareso 55 GoldFinch: High Performance RWKV/Transformer Hybrid with Linear Pre-Fill and Extreme KV-Cache Compression · 5 authors 8
Submitted by Zhaorun 49 AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases · 5 authors 3
Submitted by akhaliq 40 E5-V: Universal Embeddings with Multimodal Large Language Models · 9 authors 3
Submitted by kcz358 34 LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models · 11 authors 4
Submitted by akhaliq 13 VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control · 12 authors 3
Submitted by akhaliq 8 Goldfish: Vision-Language Understanding of Arbitrarily Long Videos · 9 authors 2
Submitted by akhaliq 6 Audio Conditioning for Music Generation via Discrete Bottleneck Features · 5 authors 2
Submitted by akhaliq 6 Splatfacto-W: A Nerfstudio Implementation of Gaussian Splatting for Unconstrained Photo Collections · 3 authors 2
Submitted by FreaxRuby 5 ThinkGrasp: A Vision-Language System for Strategic Part Grasping in Clutter · 8 authors 2
Submitted by Gootter12 5 AUITestAgent: Automatic Requirements Oriented GUI Function Testing · 8 authors 2
Submitted by xw-eric 4 NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models · 5 authors 2
Submitted by akhaliq 4 The Art of Saying No: Contextual Noncompliance in Language Models · 14 authors 2
Submitted by GaeLop 2 Zero-shot Cross-Lingual Transfer for Synthetic Data Generation in Grammatical Error Detection · 3 authors 4