Submitted by akhaliq 181 The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding · 8 authors 3
Submitted by geonp 139 InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU · 4 authors 6
Submitted by Agorium 39 Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation · 4 authors 2
Submitted by Ray2333 32 EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents · 13 authors 2
Submitted by Lp256 32 TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models · 11 authors 3
Submitted by jonkahana 31 Can this Model Also Recognize Dogs? Zero-Shot Model Search from Weights · 4 authors 2
Submitted by voidism 31 SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models · 9 authors 2
Submitted by akhaliq 30 An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging · 4 authors 4
Submitted by Neph0s 28 CoSER: Coordinating LLM-Based Persona Simulation of Established Roles · 12 authors 2
Submitted by CaraJ 27 MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency · 14 authors 2
Submitted by ZiyuG 26 Exploring the Potential of Encoder-free Architectures in 3D LMMs · 11 authors 2
Submitted by danf 16 SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models · 4 authors 2
Submitted by Haon-Chen 13 mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data · 7 authors 2
Submitted by xymeow7 12 DexTrack: Towards Generalizable Neural Tracking Control for Dexterous Manipulation from Human References · 5 authors 2
Submitted by guactastesgood 11 Mathematical Reasoning in Large Language Models: Assessing Logical and Arithmetic Errors across Wide Numerical Ranges · 3 authors 2
Submitted by BestWishYsh 7 VFX Creator: Animated Visual Effect Generation with Controllable Diffusion Transformer · 7 authors 2
Submitted by enquan2022 6 3CAD: A Large-Scale Real-World 3C Product Dataset for Unsupervised Anomaly · 7 authors 2