Submitted by akhaliq 39 Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2 · 10 authors 2
Submitted by xhyandwyy 34 mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models · 9 authors 2
Submitted by akhaliq 24 UniBench: Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling · 6 authors 2
Submitted by akhaliq 18 ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities · 12 authors 4
Submitted by akhaliq 10 Kalman-Inspired Feature Propagation for Video Face Super-Resolution · 3 authors 3
Submitted by IAMJB 9 BRAT: Bonus oRthogonAl Token for Architecture Agnostic Textual Inversion · 1 author 2
Submitted by IAMJB 8 MooER: LLM-based Speech Recognition and Translation Models from Moore Threads · 8 authors 2
Submitted by IAMJB 6 Generating novel experimental hypotheses from language models: A case study on cross-dative generalization · 2 authors 1