SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features • Paper 2502.14786 • Published 3 days ago
ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models • Paper 2502.09696 • Published 10 days ago
Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs • Paper 2406.14544 • Published Jun 20, 2024
Open-Endedness is Essential for Artificial Superhuman Intelligence • Paper 2406.04268 • Published Jun 6, 2024
Segmentation Features 🐠 • Process images for edges, dimensionality reduction, segmentation, or one-click segmentation