Submitted by LXT · 52 · OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding · 8 authors · 10
Submitted by xinlai · 42 · Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs · 6 authors · 2
Submitted by multimodalart · 34 · MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data · 2 authors · 3
Submitted by TranSirius · 30 · SeaKR: Self-aware Knowledge Retrieval for Adaptive Retrieval Augmented Generation · 8 authors · 1
Submitted by TranSirius · 25 · Aligning Teacher with Student Preferences for Tailored Training Data Generation · 6 authors · 2
Submitted by davanstrien · 23 · LiveBench: A Challenging, Contamination-Free LLM Benchmark · 15 authors · 3
Submitted by Foxfi · 15 · MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression · 13 authors · 4
Submitted by akhaliq · 11 · AUTOHALLUSION: Automatic Generation of Hallucination Benchmarks for Vision-Language Models · 12 authors · 4
Submitted by xw-eric · 10 · Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding · 9 authors · 2
Submitted by mbrack · 9 · T-FREE: Tokenizer-Free Generative LLMs via Sparse Representations for Memory-Efficient Embeddings · 5 authors · 5
Submitted by dongguanting · 6 · Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation · 6 authors · 5
Submitted by ahmedheakl · 5 · ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs · 5 authors · 5
Submitted by ahmedheakl · 3 · ResumeAtlas: Revisiting Resume Classification with Large-Scale Datasets and Large Language Models · 5 authors · 3