-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 146 -
Orion-14B: Open-source Multilingual Large Language Models
Paper • 2401.12246 • Published • 13 -
MambaByte: Token-free Selective State Space Model
Paper • 2401.13660 • Published • 54 -
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper • 2401.13601 • Published • 47
Collections
Discover the best community collections!
Collections including paper arxiv:2502.11573
-
InfiR : Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning
Paper • 2502.11573 • Published • 7 -
Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking
Paper • 2502.02339 • Published • 22 -
video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model
Paper • 2502.11775 • Published • 8 -
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
Paper • 2412.18319 • Published • 37
-
The Evolution of Multimodal Model Architectures
Paper • 2405.17927 • Published • 1 -
What matters when building vision-language models?
Paper • 2405.02246 • Published • 102 -
Efficient Architectures for High Resolution Vision-Language Models
Paper • 2501.02584 • Published -
Building and better understanding vision-language models: insights and future directions
Paper • 2408.12637 • Published • 125
-
Exploring the Potential of Encoder-free Architectures in 3D LMMs
Paper • 2502.09620 • Published • 25 -
The Evolution of Multimodal Model Architectures
Paper • 2405.17927 • Published • 1 -
What matters when building vision-language models?
Paper • 2405.02246 • Published • 102 -
Efficient Architectures for High Resolution Vision-Language Models
Paper • 2501.02584 • Published
-
LinFusion: 1 GPU, 1 Minute, 16K Image
Paper • 2409.02097 • Published • 33 -
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
Paper • 2409.11406 • Published • 26 -
Diffusion Models Are Real-Time Game Engines
Paper • 2408.14837 • Published • 123 -
Segment Anything with Multiple Modalities
Paper • 2408.09085 • Published • 22