SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features
Paper • 2502.14786
OpenCLIP & timm weights are up now, https://huggingface.co/collections/timm/siglip-2-67b8e72ba08b09dd97aecaf9