SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published 3 days ago • 88
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model Paper • 2410.13639 • Published Oct 17, 2024 • 17
MMRA: A Benchmark for Multi-granularity Multi-image Relational Association Paper • 2407.17379 • Published Jul 24, 2024 • 2
SciMMIR: Benchmarking Scientific Multi-modal Information Retrieval Paper • 2401.13478 • Published Jan 24, 2024 • 1
SciMMIR: Benchmarking Scientific Multi-modal Information Retrieval Paper • 2401.13478 • Published Jan 24, 2024 • 1