ariG23498 (HF staff) committed
Commit 3f9f96c · verified · 1 Parent(s): 80579d9

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +20 -7
README.md CHANGED
@@ -2,15 +2,20 @@
 license: apache-2.0
 tags:
 - vision
+widget:
+- src: >-
+    https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg
+  candidate_labels: bee in the sky, bee on the flower
+  example_title: Bee
+library_name: transformers
+pipeline_tag: zero-shot-image-classification
 ---
 
 # SigLIP 2 Base
 
-[SigLIP 2](https://huggingface.co/collections/google/siglip2-67b5dcef38c175486e240107)
-extends the pretraining objective of
-[SigLIP](https://huggingface.co/collections/google/siglip-659d5e62f0ae1a57ae0e83ba)
-with prior, independently developed techniques into a unified recipe, for improved semantic
-understanding, localization, and dense features.
+[SigLIP 2](https://huggingface.co/papers/2502.14786) extends the pretraining objective of
+[SigLIP](https://huggingface.co/papers/2303.15343) with prior, independently developed techniques
+into a unified recipe, for improved semantic understanding, localization, and dense features.
 
 ## Intended uses
 
@@ -80,10 +85,18 @@ The model was trained on up to 2048 TPU-v5e chips.
 
 Evaluation of SigLIP 2 is shown below (taken from the paper).
 
-[Evaluation Table](TODO)
+![Evaluation Table](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/sg2-blog/eval_table.png)
 
 ### BibTeX entry and citation info
 
 ```bibtex
-TODO
+@misc{tschannen2025siglip2multilingualvisionlanguage,
+title={SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features},
+author={Michael Tschannen and Alexey Gritsenko and Xiao Wang and Muhammad Ferjad Naeem and Ibrahim Alabdulmohsin and Nikhil Parthasarathy and Talfan Evans and Lucas Beyer and Ye Xia and Basil Mustafa and Olivier Hénaff and Jeremiah Harmsen and Andreas Steiner and Xiaohua Zhai},
+year={2025},
+eprint={2502.14786},
+archivePrefix={arXiv},
+primaryClass={cs.CV},
+url={https://arxiv.org/abs/2502.14786},
+}
 ```
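
The widget metadata added in this commit pairs an image URL with a comma-separated string of candidate labels under the `zero-shot-image-classification` pipeline tag. A minimal sketch of how those values might be fed to the `transformers` pipeline follows. The helper names `parse_candidate_labels` and `classify` are hypothetical, and `model_id` is left as a required argument because this diff does not name the checkpoint.

```python
from typing import Dict, List


def parse_candidate_labels(raw: str) -> List[str]:
    """Split the widget's comma-separated label string into clean labels."""
    return [label.strip() for label in raw.split(",") if label.strip()]


def classify(image_url: str, labels: List[str], model_id: str) -> List[Dict]:
    """Score each label against the image with a zero-shot pipeline.

    `model_id` is an assumption: the diff does not state the checkpoint ID,
    so the caller must supply a SigLIP 2 checkpoint from the Hub.
    """
    # Imported lazily so the label helper works without transformers installed.
    from transformers import pipeline

    classifier = pipeline(task="zero-shot-image-classification", model=model_id)
    return classifier(image_url, candidate_labels=labels)


# Values copied from the widget block in the README front matter.
WIDGET_SRC = (
    "https://huggingface.co/datasets/huggingface/documentation-images/"
    "resolve/main/bee.jpg"
)
WIDGET_LABELS = parse_candidate_labels("bee in the sky, bee on the flower")
```

Calling `classify(WIDGET_SRC, WIDGET_LABELS, model_id=...)` downloads the chosen checkpoint and returns a list of dictionaries, each with a `label` and a `score`.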