ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models Paper • 2502.09696 • Published 10 days ago • 38 • 5
Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks? Paper • 2411.05000 • Published Nov 7, 2024 • 22 • 3
GRAB: A Challenging GRaph Analysis Benchmark for Large Multimodal Models Paper • 2408.11817 • Published Aug 21, 2024 • 9 • 2