- LLaVA-o1: Let Vision Language Models Reason Step-by-Step
  Paper • 2411.10440 • Published • 114
- BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices
  Paper • 2411.10640 • Published • 45
- Knowledge Transfer Across Modalities with Natural Language Supervision
  Paper • 2411.15611 • Published • 17
- Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability
  Paper • 2411.19943 • Published • 58
Zhou
HaoUSF
Collections
1