x1saint committed
Commit a460a54 · verified · 1 Parent(s): 1e6eaee

Add new SentenceTransformer model
1_Pooling/config.json ADDED
{
  "word_embedding_dimension": 384,
  "pooling_mode_cls_token": false,
  "pooling_mode_mean_tokens": true,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
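This pooling configuration enables mean pooling only: the encoder's token embeddings are averaged, with padding positions masked out, to produce one 384-dimensional sentence vector. A minimal sketch of that computation, assuming generic `token_embeddings` and `attention_mask` tensors (the names are illustrative, not from this repo):

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # token_embeddings: (batch, seq_len, 384); attention_mask: (batch, seq_len)
    mask = attention_mask.unsqueeze(-1).float()    # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)  # sum over real (non-padding) tokens
    counts = mask.sum(dim=1).clamp(min=1e-9)       # number of real tokens per sentence
    return summed / counts                         # (batch, 384) mean-pooled embeddings
```

Sentence Transformers performs this step automatically via the `Pooling` module, so the sketch is for intuition only.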
README.md ADDED
---
language:
- tr
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:482091
- loss:MultipleNegativesRankingLoss
base_model: Supabase/gte-small
widget:
- source_sentence: Ya da dışarı çıkıp yürü ya da biraz koşun. Bunu düzenli olarak
    yapmıyorum ama Washington bunu yapmak için harika bir yer.
  sentences:
  - “Washington's yürüyüş ya da koşu için harika bir yer.”
  - H-2A uzaylılar Amerika Birleşik Devletleri'nde zaman kısa süreleri var.
  - “Washington'da düzenli olarak yürüyüşe ya da koşuya çıkıyorum.”
- source_sentence: Orta yaylalar ve güney kıyıları arasındaki kontrast daha belirgin
    olamazdı.
  sentences:
  - İşitme Yardımı Uyumluluğu Müzakere Kuralları Komitesi, Federal İletişim Komisyonu'nun
    bir ürünüdür.
  - Dağlık ve sahil arasındaki kontrast kolayca işaretlendi.
  - Kontrast işaretlenemedi.
- source_sentence: Bir 1997 Henry J. Kaiser Aile Vakfı anket yönetilen bakım planlarında
    Amerikalılar temelde kendi bakımı ile memnun olduğunu bulundu.
  sentences:
  - Kaplanları takip ederken çok sessiz olmalısın.
  - Henry Kaiser vakfı insanların sağlık hizmetlerinden hoşlandığını gösteriyor.
  - Henry Kaiser Vakfı insanların sağlık hizmetlerinden nefret ettiğini gösteriyor.
- source_sentence: Eminim yapmışlardır.
  sentences:
  - Eminim öyle yapmışlardır.
  - Batı Teksas'ta 100 10 dereceydi.
  - Eminim yapmamışlardır.
- source_sentence: Ve gerçekten, baba haklıydı, oğlu zaten her şeyi tecrübe etmişti,
    her şeyi denedi ve daha az ilgileniyordu.
  sentences:
  - Oğlu her şeye olan ilgisini kaybediyordu.
  - Pek bir şey yapmadım.
  - Baba oğlunun tecrübe için hala çok şey olduğunu biliyordu.
datasets:
- emrecan/all-nli-tr
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy
model-index:
- name: SentenceTransformer based on Supabase/gte-small
  results:
  - task:
      type: triplet
      name: Triplet
    dataset:
      name: all nli dev
      type: all-nli-dev
    metrics:
    - type: cosine_accuracy
      value: 0.8551850318908691
      name: Cosine Accuracy
---

# SentenceTransformer based on Supabase/gte-small

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [Supabase/gte-small](https://huggingface.co/Supabase/gte-small) on the [all-nli-tr](https://huggingface.co/datasets/emrecan/all-nli-tr) dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [Supabase/gte-small](https://huggingface.co/Supabase/gte-small) <!-- at revision 93b36ff09519291b77d6000d2e86bd8565378086 -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
    - [all-nli-tr](https://huggingface.co/datasets/emrecan/all-nli-tr)
- **Language:** tr
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference:
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("x1saint/gte-small-triplet-tr")
# Run inference
sentences = [
    'Ve gerçekten, baba haklıydı, oğlu zaten her şeyi tecrübe etmişti, her şeyi denedi ve daha az ilgileniyordu.',
    'Oğlu her şeye olan ilgisini kaybediyordu.',
    'Baba oğlunun tecrübe için hala çok şey olduğunu biliyordu.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
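
Beyond pairwise scores, the same embeddings support retrieval. As an additional illustrative example (not part of the original card), `sentence_transformers.util.semantic_search` ranks a corpus against a query by cosine similarity:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("x1saint/gte-small-triplet-tr")
# Small Turkish corpus and query, for illustration only
corpus = ['Oğlu her şeye olan ilgisini kaybediyordu.', 'Pek bir şey yapmadım.']
query = 'Ve gerçekten, baba haklıydı, oğlu zaten her şeyi tecrübe etmişti.'

corpus_emb = model.encode(corpus, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)
# Returns, per query, the top-k corpus indices and cosine scores
hits = util.semantic_search(query_emb, corpus_emb, top_k=2)
print(hits[0])  # e.g. [{'corpus_id': 0, 'score': ...}, {'corpus_id': 1, 'score': ...}]
```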

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Triplet

* Dataset: `all-nli-dev`
* Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| **cosine_accuracy** | **0.8552** |

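For reference, a sketch of how a triplet evaluation like this can be run with `TripletEvaluator`; the triplets below are tiny illustrative examples taken from the widget, not the actual 6,567-sample dev split used for the score above:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import TripletEvaluator

model = SentenceTransformer("x1saint/gte-small-triplet-tr")

evaluator = TripletEvaluator(
    anchors=['Eminim yapmışlardır.'],
    positives=['Eminim öyle yapmışlardır.'],
    negatives=['Eminim yapmamışlardır.'],
    name="all-nli-dev",
)
results = evaluator(model)  # cosine accuracy = fraction of triplets where the
print(results)              # anchor is closer to the positive than the negative
```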
<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### all-nli-tr

* Dataset: [all-nli-tr](https://huggingface.co/datasets/emrecan/all-nli-tr) at [daeabfb](https://huggingface.co/datasets/emrecan/all-nli-tr/tree/daeabfbc01f82757ab998bd23ce0ddfceaa5e24d)
* Size: 482,091 training samples
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor | positive | negative |
  |:--------|:-------|:---------|:---------|
  | type    | string | string   | string   |
  | details | <ul><li>min: 6 tokens</li><li>mean: 47.48 tokens</li><li>max: 301 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 25.16 tokens</li><li>max: 80 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 23.67 tokens</li><li>max: 90 tokens</li></ul> |
* Samples:
  | anchor | positive | negative |
  |:-------|:---------|:---------|
  | <code>Mevsim boyunca ve sanırım senin seviyendeyken onları bir sonraki seviyeye düşürürsün. Eğer ebeveyn takımını çağırmaya karar verirlerse Braves üçlü A'dan birini çağırmaya karar verirlerse çifte bir adam onun yerine geçmeye gider ve bekar bir adam gelir.</code> | <code>Eğer insanlar hatırlarsa, bir sonraki seviyeye düşersin.</code> | <code>Hiçbir şeyi hatırlamazlar.</code> |
  | <code>Numaramızdan biri talimatlarınızı birazdan yerine getirecektir.</code> | <code>Ekibimin bir üyesi emirlerinizi büyük bir hassasiyetle yerine getirecektir.</code> | <code>Şu anda boş kimsek yok, bu yüzden sen de harekete geçmelisin.</code> |
  | <code>Bunu nereden biliyorsun? Bütün bunlar yine onların bilgileri.</code> | <code>Bu bilgi onlara ait.</code> | <code>Hiçbir bilgileri yok.</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```

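`MultipleNegativesRankingLoss` treats, for each anchor, the other in-batch positives (plus the explicit negatives) as additional negatives, so a larger effective batch gives a stronger contrastive signal. A minimal sketch of constructing the loss with the parameters above (illustrative; the card does not include the original training script):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MultipleNegativesRankingLoss
from sentence_transformers.util import cos_sim

model = SentenceTransformer("Supabase/gte-small")
# scale=20.0 and similarity_fct=cos_sim match the parameters listed above
loss = MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=cos_sim)
```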
### Evaluation Dataset

#### all-nli-tr

* Dataset: [all-nli-tr](https://huggingface.co/datasets/emrecan/all-nli-tr) at [daeabfb](https://huggingface.co/datasets/emrecan/all-nli-tr/tree/daeabfbc01f82757ab998bd23ce0ddfceaa5e24d)
* Size: 6,567 evaluation samples
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor | positive | negative |
  |:--------|:-------|:---------|:---------|
  | type    | string | string   | string   |
  | details | <ul><li>min: 4 tokens</li><li>mean: 45.12 tokens</li><li>max: 201 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 25.11 tokens</li><li>max: 98 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 23.81 tokens</li><li>max: 64 tokens</li></ul> |
* Samples:
  | anchor | positive | negative |
  |:-------|:---------|:---------|
  | <code>Bilemiyorum. Onunla ilgili karışık duygularım var. Bazen ondan hoşlanıyorum ama aynı zamanda birisinin onu dövmesini görmeyi seviyorum.</code> | <code>Çoğunlukla ondan hoşlanıyorum, ama yine de birinin onu dövdüğünü görmekten zevk alıyorum.</code> | <code>O benim favorim ve kimsenin onu yendiğini görmek istemiyorum.</code> |
  | <code>Sen ve arkadaşların burada hoş karşılanmaz, Severn söyledi.</code> | <code>Severn orada insanların hoş karşılanmadığını söyledi.</code> | <code>Severn orada insanların her zaman hoş karşılanacağını söyledi.</code> |
  | <code>Gecenin en aşağısı ne olduğundan emin değilim.</code> | <code>Dün gece ne kadar soğuk oldu bilmiyorum.</code> | <code>Dün gece hava 37 dereceydi.</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 64
- `gradient_accumulation_steps`: 4
- `learning_rate`: 1e-05
- `warmup_ratio`: 0.1
- `bf16`: True
- `dataloader_num_workers`: 4

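These values correspond to a `SentenceTransformerTrainingArguments` setup along the following lines (a sketch reconstructed from the listed hyperparameters; `output_dir` is a placeholder):

```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="outputs/gte-small-triplet-tr",  # placeholder path
    num_train_epochs=3,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=4,  # effective train batch size: 32 * 4 = 128
    learning_rate=1e-5,
    warmup_ratio=0.1,
    bf16=True,
    dataloader_num_workers=4,
    eval_strategy="steps",
)
```

Together with the `MultipleNegativesRankingLoss` and `TripletEvaluator` sketched above, these arguments would be passed to a `SentenceTransformerTrainer`.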
#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 64
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 4
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 1e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 3
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 4
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional

</details>

### Training Logs
| Epoch  | Step  | Training Loss | Validation Loss | all-nli-dev_cosine_accuracy |
|:------:|:-----:|:-------------:|:---------------:|:---------------------------:|
| 0.1327 | 500   | 9.1341        | 1.4261          | 0.7835                      |
| 0.2655 | 1000  | 5.2529        | 1.2543          | 0.7967                      |
| 0.3982 | 1500  | 4.5877        | 1.1583          | 0.8119                      |
| 0.5310 | 2000  | 4.229         | 1.0974          | 0.8171                      |
| 0.6637 | 2500  | 4.0158        | 1.0592          | 0.8238                      |
| 0.7965 | 3000  | 3.7869        | 1.0161          | 0.8310                      |
| 0.9292 | 3500  | 3.6862        | 0.9897          | 0.8372                      |
| 1.0619 | 4000  | 3.5519        | 0.9751          | 0.8406                      |
| 1.1946 | 4500  | 3.3986        | 0.9596          | 0.8421                      |
| 1.3274 | 5000  | 3.3479        | 0.9377          | 0.8435                      |
| 1.4601 | 5500  | 3.3104        | 0.9296          | 0.8465                      |
| 1.5929 | 6000  | 3.2255        | 0.9178          | 0.8467                      |
| 1.7256 | 6500  | 3.1998        | 0.9077          | 0.8514                      |
| 1.8584 | 7000  | 3.1491        | 0.9017          | 0.8496                      |
| 1.9911 | 7500  | 3.1337        | 0.8955          | 0.8511                      |
| 2.1237 | 8000  | 3.052         | 0.8885          | 0.8526                      |
| 2.2565 | 8500  | 2.9998        | 0.8836          | 0.8524                      |
| 2.3892 | 9000  | 2.9835        | 0.8794          | 0.8517                      |
| 2.5220 | 9500  | 2.9941        | 0.8778          | 0.8532                      |
| 2.6547 | 10000 | 2.9704        | 0.8744          | 0.8555                      |
| 2.7875 | 10500 | 2.9731        | 0.8723          | 0.8541                      |
| 2.9202 | 11000 | 2.9221        | 0.8717          | 0.8552                      |


### Framework Versions
- Python: 3.11.11
- Sentence Transformers: 3.4.1
- Transformers: 4.48.3
- PyTorch: 2.5.1+cu124
- Accelerate: 1.3.0
- Datasets: 3.3.0
- Tokenizers: 0.21.0

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
{
  "_name_or_path": "Supabase/gte-small",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 384,
  "initializer_range": 0.02,
  "intermediate_size": 1536,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.48.3",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
config_sentence_transformers.json ADDED
{
  "__version__": {
    "sentence_transformers": "3.4.1",
    "transformers": "4.48.3",
    "pytorch": "2.5.1+cu124"
  },
  "prompts": {},
  "default_prompt_name": null,
  "similarity_fn_name": "cosine"
}
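`"similarity_fn_name": "cosine"` means `model.similarity` scores embedding pairs with cosine similarity. For intuition, the underlying formula as a sketch (`SentenceTransformer.similarity` computes this for you, batched):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # cos(a, b) = a·b / (|a| |b|), in [-1, 1]; higher means more similar
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```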
model.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:f5e6e112700d2041cd33132a793e12319b775770793fb036129e9ff07f5a6bef
size 133462128
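The file size is consistent with the `config.json` above: roughly 33M float32 parameters at 4 bytes each. A back-of-envelope sketch (ignoring biases, layer norms, and the BERT pooler head, which together add under 1%):

```python
hidden, ffn, layers = 384, 1536, 12
vocab, max_pos, type_vocab = 30522, 512, 2

embeddings = (vocab + max_pos + type_vocab) * hidden  # 11,917,824 params
per_layer = 4 * hidden * hidden + 2 * hidden * ffn    # attention + FFN = 1,769,472
total = embeddings + layers * per_layer               # 33,151,488 params
print(total, total * 4)  # ~33.2M params, ~132.6 MB; actual file: 133,462,128 bytes
```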
modules.json ADDED
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  }
]
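These two modules are exactly what `SentenceTransformer("x1saint/gte-small-triplet-tr")` assembles on load. Equivalently, the same pipeline can be built by hand with the public `models` API (a sketch; behavior should match the saved configuration):

```python
from sentence_transformers import SentenceTransformer, models

# Module 0: the BERT encoder, mirroring sentence_bert_config.json
word = models.Transformer("x1saint/gte-small-triplet-tr", max_seq_length=512)
# Module 1: mean pooling over token embeddings, mirroring 1_Pooling/config.json
pool = models.Pooling(word.get_word_embedding_dimension(), pooling_mode="mean")
model = SentenceTransformer(modules=[word, pool])
```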
sentence_bert_config.json ADDED
{
  "max_seq_length": 512,
  "do_lower_case": false
}
special_tokens_map.json ADDED
{
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "extra_special_tokens": {},
  "mask_token": "[MASK]",
  "max_length": 128,
  "model_max_length": 512,
  "never_split": null,
  "pad_to_multiple_of": null,
  "pad_token": "[PAD]",
  "pad_token_type_id": 0,
  "padding_side": "right",
  "sep_token": "[SEP]",
  "stride": 0,
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "truncation_side": "right",
  "truncation_strategy": "longest_first",
  "unk_token": "[UNK]"
}
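The tokenizer is a standard lowercasing WordPiece `BertTokenizer` (`do_lower_case: true`, `model_max_length: 512`). A quick sketch of loading and inspecting it (the Turkish example string is illustrative):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("x1saint/gte-small-triplet-tr")
enc = tok("Eminim yapmışlardır.", truncation=True, max_length=512)
# do_lower_case=true, so input is lowercased before WordPiece splitting
print(tok.convert_ids_to_tokens(enc["input_ids"]))
```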
vocab.txt ADDED
The diff for this file is too large to render. See raw diff