- Multimodal Clembench
  🏆 Explore and compare models on a leaderboard and plots
- SEED-Bench Leaderboard
- MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
  Paper • 2311.16502 • Published • 35
- MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark
  Paper • 2409.02813 • Published • 29
Collections including paper arxiv:2408.03361

- Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models
  Paper • 2408.04594 • Published • 15
- Achieving Human Level Competitive Robot Table Tennis
  Paper • 2408.03906 • Published • 27
- GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI
  Paper • 2408.03361 • Published • 86
- Heavy Labels Out! Dataset Distillation with Label Space Lightening
  Paper • 2408.08201 • Published • 19

- A Comparative Study on Automatic Coding of Medical Letters with Explainability
  Paper • 2407.13638 • Published • 5
- Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence
  Paper • 2407.07061 • Published • 27
- AgentInstruct: Toward Generative Teaching with Agentic Flows
  Paper • 2407.03502 • Published • 50
- Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions
  Paper • 2407.06723 • Published • 11

- Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
  Paper • 2405.08748 • Published • 24
- Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection
  Paper • 2405.10300 • Published • 28
- Chameleon: Mixed-Modal Early-Fusion Foundation Models
  Paper • 2405.09818 • Published • 131
- OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
  Paper • 2405.11143 • Published • 37

- SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension
  Paper • 2404.16790 • Published • 8
- MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos
  Paper • 2406.08407 • Published • 27
- GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI
  Paper • 2408.03361 • Published • 86

- GAIA: a benchmark for General AI Assistants
  Paper • 2311.12983 • Published • 192
- MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
  Paper • 2311.16502 • Published • 35
- BLINK: Multimodal Large Language Models Can See but Not Perceive
  Paper • 2404.12390 • Published • 26
- RULER: What's the Real Context Size of Your Long-Context Language Models?
  Paper • 2404.06654 • Published • 35