tomaarsen HF Staff committed on
Commit
ccd4939
·
verified ·
1 Parent(s): c059494

Add new SentenceTransformer model

README.md ADDED
@@ -0,0 +1,1021 @@
1
+ ---
2
+ language:
3
+ - en
4
+ license: apache-2.0
5
+ tags:
6
+ - sentence-transformers
7
+ - sentence-similarity
8
+ - feature-extraction
9
+ - dense
10
+ - generated_from_trainer
11
+ - dataset_size:3002496
12
+ - loss:MatryoshkaLoss
13
+ - loss:MultipleNegativesRankingLoss
14
+ widget:
15
+ - source_sentence: how to sign legal documents as power of attorney?
16
+ sentences:
17
+ - 'After the principal''s name, write “by” and then sign your own name. Under or
18
+ after the signature line, indicate your status as POA by including any of the
19
+ following identifiers: as POA, as Agent, as Attorney in Fact or as Power of Attorney.'
20
+ - '[''From the Home screen, swipe left to Apps.'', ''Tap Transfer my Data.'', ''Tap
21
+ Menu (...).'', ''Tap Export to SD card.'']'
22
+ - Ginger Dank Nugs (Grape) - 350mg. Feast your eyes on these unique and striking
23
+ gourmet chocolates; Coco Nugs created by Ginger Dank. Crafted to resemble perfect
24
+ nugs of cannabis, each of the 10 buds contains 35mg of THC. ... This is a perfect
25
+ product for both cannabis and chocolate lovers, who appreciate a little twist.
26
+ - source_sentence: how to delete vdom in fortigate?
27
+ sentences:
28
+ - Go to System -> VDOM -> VDOM2 and select 'Delete'. This VDOM is now successfully
29
+ removed from the configuration.
30
+ - 'Both combination birth control pills and progestin-only pills may cause headaches
31
+ as a side effect. Additional side effects of birth control pills may include:
32
+ breast tenderness. nausea.'
33
+ - White cheese tends to show imperfections more readily and as consumers got more
34
+ used to yellow-orange cheese, it became an expected option. Today, many cheddars
35
+ are yellow. While most cheesemakers use annatto, some use an artificial coloring
36
+ agent instead, according to Sachs.
37
+ - source_sentence: where are earthquakes most likely to occur on earth?
38
+ sentences:
39
+ - Zelle in the Bank of the America app is a fast, safe, and easy way to send and
40
+ receive money with family and friends who have a bank account in the U.S., all
41
+ with no fees. Money moves in minutes directly between accounts that are already
42
+ enrolled with Zelle.
43
+ - It takes about 3 days for a spacecraft to reach the Moon. During that time a spacecraft
44
+ travels at least 240,000 miles (386,400 kilometers) which is the distance between
45
+ Earth and the Moon.
46
+ - Most earthquakes occur along the edge of the oceanic and continental plates. The
47
+ earth's crust (the outer layer of the planet) is made up of several pieces, called
48
+ plates. The plates under the oceans are called oceanic plates and the rest are
49
+ continental plates.
50
+ - source_sentence: fix iphone is disabled connect to itunes without itunes?
51
+ sentences:
52
+ - To fix a disabled iPhone or iPad without iTunes, you have to erase your device.
53
+ Click on the "Erase iPhone" option and confirm your selection. Wait for a while
54
+ as the "Find My iPhone" feature will remotely erase your iOS device. Needless
55
+ to say, it will also disable its lock.
56
+ - How Māui brought fire to the world. One evening, after eating a hearty meal, Māui
57
+ lay beside his fire staring into the flames. ... In the middle of the night, while
58
+ everyone was sleeping, Māui went from village to village and extinguished all
59
+ the fires until not a single fire burned in the world.
60
+ - Angry Orchard makes a variety of year-round craft cider styles, including Angry
61
+ Orchard Crisp Apple, a fruit-forward hard cider that balances the sweetness of
62
+ culinary apples with dryness and bright acidity of bittersweet apples for a complex,
63
+ refreshing taste.
64
+ - source_sentence: how to reverse a video on tiktok that's not yours?
65
+ sentences:
66
+ - '[''Tap "Effects" at the bottom of your screen — it\''s an icon that looks like
67
+ a clock. Open the Effects menu. ... '', ''At the end of the new list that appears,
68
+ tap "Time." Select "Time" at the end. ... '', ''Select "Reverse" — you\''ll then
69
+ see a preview of your new, reversed video appear on the screen.'']'
70
+ - Franchise Facts Poke Bar has a franchise fee of up to $30,000, with a total initial
71
+ investment range of $157,800 to $438,000. The initial cost of a franchise includes
72
+ several fees -- Unlock this franchise to better understand the costs such as training
73
+ and territory fees.
74
+ - Relative age is the age of a rock layer (or the fossils it contains) compared
75
+ to other layers. It can be determined by looking at the position of rock layers.
76
+ Absolute age is the numeric age of a layer of rocks or fossils. Absolute age can
77
+ be determined by using radiometric dating.
78
+ datasets:
79
+ - sentence-transformers/gooaq
80
+ pipeline_tag: sentence-similarity
81
+ library_name: sentence-transformers
82
+ metrics:
83
+ - cosine_accuracy@1
84
+ - cosine_accuracy@3
85
+ - cosine_accuracy@5
86
+ - cosine_accuracy@10
87
+ - cosine_precision@1
88
+ - cosine_precision@3
89
+ - cosine_precision@5
90
+ - cosine_precision@10
91
+ - cosine_recall@1
92
+ - cosine_recall@3
93
+ - cosine_recall@5
94
+ - cosine_recall@10
95
+ - cosine_ndcg@10
96
+ - cosine_mrr@10
97
+ - cosine_map@100
98
+ co2_eq_emissions:
99
+ emissions: 7.447488216858034
100
+ energy_consumed: 0.019159891682723612
101
+ source: codecarbon
102
+ training_type: fine-tuning
103
+ on_cloud: false
104
+ cpu_model: 13th Gen Intel(R) Core(TM) i7-13700K
105
+ ram_total_size: 31.777088165283203
106
+ hours_used: 0.124
107
+ hardware_used: 1 x NVIDIA GeForce RTX 3090
108
+ model-index:
109
+ - name: Static Embeddings with BEE-spoke-data/wordpiece-tokenizer-32k-en_code-msp
110
+ tokenizer finetuned on GooAQ pairs
111
+ results:
112
+ - task:
113
+ type: information-retrieval
114
+ name: Information Retrieval
115
+ dataset:
116
+ name: gooaq 1024 dev
117
+ type: gooaq-1024-dev
118
+ metrics:
119
+ - type: cosine_accuracy@1
120
+ value: 0.6335
121
+ name: Cosine Accuracy@1
122
+ - type: cosine_accuracy@3
123
+ value: 0.8394
124
+ name: Cosine Accuracy@3
125
+ - type: cosine_accuracy@5
126
+ value: 0.8979
127
+ name: Cosine Accuracy@5
128
+ - type: cosine_accuracy@10
129
+ value: 0.9454
130
+ name: Cosine Accuracy@10
131
+ - type: cosine_precision@1
132
+ value: 0.6335
133
+ name: Cosine Precision@1
134
+ - type: cosine_precision@3
135
+ value: 0.27979999999999994
136
+ name: Cosine Precision@3
137
+ - type: cosine_precision@5
138
+ value: 0.17958000000000002
139
+ name: Cosine Precision@5
140
+ - type: cosine_precision@10
141
+ value: 0.09454000000000003
142
+ name: Cosine Precision@10
143
+ - type: cosine_recall@1
144
+ value: 0.6335
145
+ name: Cosine Recall@1
146
+ - type: cosine_recall@3
147
+ value: 0.8394
148
+ name: Cosine Recall@3
149
+ - type: cosine_recall@5
150
+ value: 0.8979
151
+ name: Cosine Recall@5
152
+ - type: cosine_recall@10
153
+ value: 0.9454
154
+ name: Cosine Recall@10
155
+ - type: cosine_ndcg@10
156
+ value: 0.7948890776997601
157
+ name: Cosine Ndcg@10
158
+ - type: cosine_mrr@10
159
+ value: 0.7459194047618989
160
+ name: Cosine Mrr@10
161
+ - type: cosine_map@100
162
+ value: 0.7484214498572738
163
+ name: Cosine Map@100
164
+ - task:
165
+ type: information-retrieval
166
+ name: Information Retrieval
167
+ dataset:
168
+ name: gooaq 512 dev
169
+ type: gooaq-512-dev
170
+ metrics:
171
+ - type: cosine_accuracy@1
172
+ value: 0.6285
173
+ name: Cosine Accuracy@1
174
+ - type: cosine_accuracy@3
175
+ value: 0.8339
176
+ name: Cosine Accuracy@3
177
+ - type: cosine_accuracy@5
178
+ value: 0.8943
179
+ name: Cosine Accuracy@5
180
+ - type: cosine_accuracy@10
181
+ value: 0.9425
182
+ name: Cosine Accuracy@10
183
+ - type: cosine_precision@1
184
+ value: 0.6285
185
+ name: Cosine Precision@1
186
+ - type: cosine_precision@3
187
+ value: 0.2779666666666666
188
+ name: Cosine Precision@3
189
+ - type: cosine_precision@5
190
+ value: 0.17886000000000002
191
+ name: Cosine Precision@5
192
+ - type: cosine_precision@10
193
+ value: 0.09425000000000003
194
+ name: Cosine Precision@10
195
+ - type: cosine_recall@1
196
+ value: 0.6285
197
+ name: Cosine Recall@1
198
+ - type: cosine_recall@3
199
+ value: 0.8339
200
+ name: Cosine Recall@3
201
+ - type: cosine_recall@5
202
+ value: 0.8943
203
+ name: Cosine Recall@5
204
+ - type: cosine_recall@10
205
+ value: 0.9425
206
+ name: Cosine Recall@10
207
+ - type: cosine_ndcg@10
208
+ value: 0.7907464684784297
209
+ name: Cosine Ndcg@10
210
+ - type: cosine_mrr@10
211
+ value: 0.7413761111111041
212
+ name: Cosine Mrr@10
213
+ - type: cosine_map@100
214
+ value: 0.7439975831469758
215
+ name: Cosine Map@100
216
+ - task:
217
+ type: information-retrieval
218
+ name: Information Retrieval
219
+ dataset:
220
+ name: gooaq 256 dev
221
+ type: gooaq-256-dev
222
+ metrics:
223
+ - type: cosine_accuracy@1
224
+ value: 0.6196
225
+ name: Cosine Accuracy@1
226
+ - type: cosine_accuracy@3
227
+ value: 0.8262
228
+ name: Cosine Accuracy@3
229
+ - type: cosine_accuracy@5
230
+ value: 0.888
231
+ name: Cosine Accuracy@5
232
+ - type: cosine_accuracy@10
233
+ value: 0.9375
234
+ name: Cosine Accuracy@10
235
+ - type: cosine_precision@1
236
+ value: 0.6196
237
+ name: Cosine Precision@1
238
+ - type: cosine_precision@3
239
+ value: 0.2754
240
+ name: Cosine Precision@3
241
+ - type: cosine_precision@5
242
+ value: 0.17760000000000004
243
+ name: Cosine Precision@5
244
+ - type: cosine_precision@10
245
+ value: 0.09375000000000001
246
+ name: Cosine Precision@10
247
+ - type: cosine_recall@1
248
+ value: 0.6196
249
+ name: Cosine Recall@1
250
+ - type: cosine_recall@3
251
+ value: 0.8262
252
+ name: Cosine Recall@3
253
+ - type: cosine_recall@5
254
+ value: 0.888
255
+ name: Cosine Recall@5
256
+ - type: cosine_recall@10
257
+ value: 0.9375
258
+ name: Cosine Recall@10
259
+ - type: cosine_ndcg@10
260
+ value: 0.7830118342115728
261
+ name: Cosine Ndcg@10
262
+ - type: cosine_mrr@10
263
+ value: 0.73284916666666
264
+ name: Cosine Mrr@10
265
+ - type: cosine_map@100
266
+ value: 0.7356198073355731
267
+ name: Cosine Map@100
268
+ - task:
269
+ type: information-retrieval
270
+ name: Information Retrieval
271
+ dataset:
272
+ name: gooaq 128 dev
273
+ type: gooaq-128-dev
274
+ metrics:
275
+ - type: cosine_accuracy@1
276
+ value: 0.597
277
+ name: Cosine Accuracy@1
278
+ - type: cosine_accuracy@3
279
+ value: 0.8033
280
+ name: Cosine Accuracy@3
281
+ - type: cosine_accuracy@5
282
+ value: 0.8681
283
+ name: Cosine Accuracy@5
284
+ - type: cosine_accuracy@10
285
+ value: 0.9247
286
+ name: Cosine Accuracy@10
287
+ - type: cosine_precision@1
288
+ value: 0.597
289
+ name: Cosine Precision@1
290
+ - type: cosine_precision@3
291
+ value: 0.26776666666666665
292
+ name: Cosine Precision@3
293
+ - type: cosine_precision@5
294
+ value: 0.17361999999999997
295
+ name: Cosine Precision@5
296
+ - type: cosine_precision@10
297
+ value: 0.09247000000000001
298
+ name: Cosine Precision@10
299
+ - type: cosine_recall@1
300
+ value: 0.597
301
+ name: Cosine Recall@1
302
+ - type: cosine_recall@3
303
+ value: 0.8033
304
+ name: Cosine Recall@3
305
+ - type: cosine_recall@5
306
+ value: 0.8681
307
+ name: Cosine Recall@5
308
+ - type: cosine_recall@10
309
+ value: 0.9247
310
+ name: Cosine Recall@10
311
+ - type: cosine_ndcg@10
312
+ value: 0.7633008182074578
313
+ name: Cosine Ndcg@10
314
+ - type: cosine_mrr@10
315
+ value: 0.7111824206349133
316
+ name: Cosine Mrr@10
317
+ - type: cosine_map@100
318
+ value: 0.714297170282837
319
+ name: Cosine Map@100
320
+ - task:
321
+ type: information-retrieval
322
+ name: Information Retrieval
323
+ dataset:
324
+ name: gooaq 64 dev
325
+ type: gooaq-64-dev
326
+ metrics:
327
+ - type: cosine_accuracy@1
328
+ value: 0.5541
329
+ name: Cosine Accuracy@1
330
+ - type: cosine_accuracy@3
331
+ value: 0.7568
332
+ name: Cosine Accuracy@3
333
+ - type: cosine_accuracy@5
334
+ value: 0.8287
335
+ name: Cosine Accuracy@5
336
+ - type: cosine_accuracy@10
337
+ value: 0.896
338
+ name: Cosine Accuracy@10
339
+ - type: cosine_precision@1
340
+ value: 0.5541
341
+ name: Cosine Precision@1
342
+ - type: cosine_precision@3
343
+ value: 0.25226666666666664
344
+ name: Cosine Precision@3
345
+ - type: cosine_precision@5
346
+ value: 0.16574
347
+ name: Cosine Precision@5
348
+ - type: cosine_precision@10
349
+ value: 0.08960000000000001
350
+ name: Cosine Precision@10
351
+ - type: cosine_recall@1
352
+ value: 0.5541
353
+ name: Cosine Recall@1
354
+ - type: cosine_recall@3
355
+ value: 0.7568
356
+ name: Cosine Recall@3
357
+ - type: cosine_recall@5
358
+ value: 0.8287
359
+ name: Cosine Recall@5
360
+ - type: cosine_recall@10
361
+ value: 0.896
362
+ name: Cosine Recall@10
363
+ - type: cosine_ndcg@10
364
+ value: 0.7246476170472534
365
+ name: Cosine Ndcg@10
366
+ - type: cosine_mrr@10
367
+ value: 0.6696768650793602
368
+ name: Cosine Mrr@10
369
+ - type: cosine_map@100
370
+ value: 0.6735610073887002
371
+ name: Cosine Map@100
372
+ - task:
373
+ type: information-retrieval
374
+ name: Information Retrieval
375
+ dataset:
376
+ name: gooaq 32 dev
377
+ type: gooaq-32-dev
378
+ metrics:
379
+ - type: cosine_accuracy@1
380
+ value: 0.4602
381
+ name: Cosine Accuracy@1
382
+ - type: cosine_accuracy@3
383
+ value: 0.6631
384
+ name: Cosine Accuracy@3
385
+ - type: cosine_accuracy@5
386
+ value: 0.7372
387
+ name: Cosine Accuracy@5
388
+ - type: cosine_accuracy@10
389
+ value: 0.8226
390
+ name: Cosine Accuracy@10
391
+ - type: cosine_precision@1
392
+ value: 0.4602
393
+ name: Cosine Precision@1
394
+ - type: cosine_precision@3
395
+ value: 0.22103333333333336
396
+ name: Cosine Precision@3
397
+ - type: cosine_precision@5
398
+ value: 0.14744000000000002
399
+ name: Cosine Precision@5
400
+ - type: cosine_precision@10
401
+ value: 0.08225999999999999
402
+ name: Cosine Precision@10
403
+ - type: cosine_recall@1
404
+ value: 0.4602
405
+ name: Cosine Recall@1
406
+ - type: cosine_recall@3
407
+ value: 0.6631
408
+ name: Cosine Recall@3
409
+ - type: cosine_recall@5
410
+ value: 0.7372
411
+ name: Cosine Recall@5
412
+ - type: cosine_recall@10
413
+ value: 0.8226
414
+ name: Cosine Recall@10
415
+ - type: cosine_ndcg@10
416
+ value: 0.6372411594771165
417
+ name: Cosine Ndcg@10
418
+ - type: cosine_mrr@10
419
+ value: 0.5783468650793636
420
+ name: Cosine Mrr@10
421
+ - type: cosine_map@100
422
+ value: 0.5841294309265819
423
+ name: Cosine Map@100
424
+ ---
425
+
426
+ # Static Embeddings with BEE-spoke-data/wordpiece-tokenizer-32k-en_code-msp tokenizer finetuned on GooAQ pairs
427
+
428
+ This is a [sentence-transformers](https://www.SBERT.net) model trained on the [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) dataset. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
429
+
430
+ ## Model Details
431
+
432
+ ### Model Description
433
+ - **Model Type:** Sentence Transformer
434
+ <!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
435
+ - **Maximum Sequence Length:** inf tokens
436
+ - **Output Dimensionality:** 1024 dimensions
437
+ - **Similarity Function:** Cosine Similarity
438
+ - **Training Dataset:**
439
+ - [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq)
440
+ - **Language:** en
441
+ - **License:** apache-2.0
442
+
443
+ ### Model Sources
444
+
445
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
446
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
447
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
448
+
449
+ ### Full Model Architecture
450
+
451
+ ```
452
+ SentenceTransformer(
453
+ (0): StaticEmbedding(
454
+ (embedding): EmbeddingBag(31999, 1024, mode='mean')
455
+ )
456
+ )
457
+ ```
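The architecture above is a single embedding table pooled with `mode='mean'`: a sentence vector is the average of the rows selected by its token ids. The following is a minimal pure-Python sketch of that pooling step, not the library's implementation. The names `mean_pool` and `toy_table` are hypothetical, and the toy table uses 3 tokens and 4 dimensions instead of the model's 31999 x 1024.

```python
from typing import Dict, List


def mean_pool(token_ids: List[int], table: Dict[int, List[float]]) -> List[float]:
    """Average the embedding rows of the given token ids (mean pooling)."""
    if not token_ids:
        raise ValueError("need at least one token id")
    dim = len(next(iter(table.values())))
    total = [0.0] * dim
    for tid in token_ids:
        row = table[tid]
        for i in range(dim):
            total[i] += row[i]
    return [x / len(token_ids) for x in total]


# Hypothetical toy table: token id -> embedding row.
toy_table = {
    0: [1.0, 0.0, 2.0, 4.0],
    1: [3.0, 2.0, 0.0, 0.0],
    2: [0.0, 0.0, 0.0, 0.0],
}

sentence_vector = mean_pool([0, 1], toy_table)
print(sentence_vector)  # [2.0, 1.0, 1.0, 2.0]
```

Because there is no attention or positional layer, word order does not affect the result, and the cost per sentence is one lookup and one average per token.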
458
+
459
+ ## Usage
460
+
461
+ ### Direct Usage (Sentence Transformers)
462
+
463
+ First install the Sentence Transformers library:
464
+
465
+ ```bash
466
+ pip install -U sentence-transformers
467
+ ```
468
+
469
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("tomaarsen/static-BEE-spoke-data-tokenizer-v2-gooaq")
+ # Run inference
+ sentences = [
+     "how to reverse a video on tiktok that's not yours?",
+     '[\'Tap "Effects" at the bottom of your screen — it\\\'s an icon that looks like a clock. Open the Effects menu. ... \', \'At the end of the new list that appears, tap "Time." Select "Time" at the end. ... \', \'Select "Reverse" — you\\\'ll then see a preview of your new, reversed video appear on the screen.\']',
+     'Relative age is the age of a rock layer (or the fossils it contains) compared to other layers. It can be determined by looking at the position of rock layers. Absolute age is the numeric age of a layer of rocks or fossils. Absolute age can be determined by using radiometric dating.',
+ ]
+ # encode() returns a NumPy array by default
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # (3, 1024)
+
+ # Get the similarity scores for the embeddings (returned as a torch.Tensor)
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # torch.Size([3, 3])
+ ```
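Since the model was trained with `MatryoshkaLoss` over the dimensions 1024, 512, 256, 128, 64 and 32, its embeddings can be shortened by keeping only the leading components. The sketch below shows that idea on plain Python lists, assuming the truncated vector is re-normalized before cosine similarity is taken. The helper names and the toy vectors are hypothetical. With the library itself, the `truncate_dim` argument of `SentenceTransformer` is the usual way to get the same effect.

```python
import math
from typing import List


def truncate_and_normalize(vec: List[float], dim: int) -> List[float]:
    """Keep the first `dim` components and rescale them to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head
    return [x / norm for x in head]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical 4-dimensional "full" embeddings, truncated to 2 dimensions.
full_a = [3.0, 4.0, 1.0, -2.0]
full_b = [6.0, 8.0, -5.0, 0.5]
short_a = truncate_and_normalize(full_a, 2)
short_b = truncate_and_normalize(full_b, 2)

print(short_a)  # [0.6, 0.8]
print(cosine(full_a, full_b), cosine(short_a, short_b))
```

The evaluation tables in this card report how much retrieval quality is kept at each truncation size.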
490
+
491
+ <!--
492
+ ### Direct Usage (Transformers)
493
+
494
+ <details><summary>Click to see the direct usage in Transformers</summary>
495
+
496
+ </details>
497
+ -->
498
+
499
+ <!--
500
+ ### Downstream Usage (Sentence Transformers)
501
+
502
+ You can finetune this model on your own dataset.
503
+
504
+ <details><summary>Click to expand</summary>
505
+
506
+ </details>
507
+ -->
508
+
509
+ <!--
510
+ ### Out-of-Scope Use
511
+
512
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
513
+ -->
514
+
515
+ ## Evaluation
516
+
517
+ ### Metrics
518
+
519
+ #### Information Retrieval
520
+
521
+ * Dataset: `gooaq-1024-dev`
522
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
523
+ ```json
524
+ {
525
+ "truncate_dim": 1024
526
+ }
527
+ ```
528
+
529
+ | Metric | Value |
530
+ |:--------------------|:-----------|
531
+ | cosine_accuracy@1 | 0.6335 |
532
+ | cosine_accuracy@3 | 0.8394 |
533
+ | cosine_accuracy@5 | 0.8979 |
534
+ | cosine_accuracy@10 | 0.9454 |
535
+ | cosine_precision@1 | 0.6335 |
536
+ | cosine_precision@3 | 0.2798 |
537
+ | cosine_precision@5 | 0.1796 |
538
+ | cosine_precision@10 | 0.0945 |
539
+ | cosine_recall@1 | 0.6335 |
540
+ | cosine_recall@3 | 0.8394 |
541
+ | cosine_recall@5 | 0.8979 |
542
+ | cosine_recall@10 | 0.9454 |
543
+ | **cosine_ndcg@10** | **0.7949** |
544
+ | cosine_mrr@10 | 0.7459 |
545
+ | cosine_map@100 | 0.7484 |
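The numbers in this table are consistent with each query having exactly one relevant answer: recall@k equals accuracy@k, and precision@k equals accuracy@k divided by k (for example 0.8394 / 3 ≈ 0.2798). Under that assumption, all of the @k metrics follow from the 1-based rank of the relevant answer for each query. The sketch below illustrates the arithmetic only. It is not the `InformationRetrievalEvaluator` code, and `metrics_at_k` and `toy_ranks` are hypothetical names.

```python
import math
from typing import Dict, List, Optional


def metrics_at_k(ranks: List[Optional[int]], k: int) -> Dict[str, float]:
    """Compute @k metrics from the 1-based rank of each query's single relevant doc.

    A rank of None means the relevant doc was not retrieved at all.
    """
    n = len(ranks)
    hits = [r for r in ranks if r is not None and r <= k]
    accuracy = len(hits) / n
    return {
        "accuracy": accuracy,
        "precision": accuracy / k,  # at most one of the k results is relevant
        "recall": accuracy,         # the single relevant doc is found or not
        "mrr": sum(1.0 / r for r in hits) / n,
        # With one relevant doc the ideal DCG is 1, so NDCG is 1 / log2(rank + 1).
        "ndcg": sum(1.0 / math.log2(r + 1) for r in hits) / n,
    }


# Hypothetical ranks for five queries.
toy_ranks = [1, 2, 1, 4, None]
result = metrics_at_k(toy_ranks, k=3)
print(result["accuracy"])  # 0.6
print(result["mrr"], result["ndcg"])
```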
546
+
547
+ #### Information Retrieval
548
+
549
+ * Dataset: `gooaq-512-dev`
550
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
551
+ ```json
552
+ {
553
+ "truncate_dim": 512
554
+ }
555
+ ```
556
+
557
+ | Metric | Value |
558
+ |:--------------------|:-----------|
559
+ | cosine_accuracy@1 | 0.6285 |
560
+ | cosine_accuracy@3 | 0.8339 |
561
+ | cosine_accuracy@5 | 0.8943 |
562
+ | cosine_accuracy@10 | 0.9425 |
563
+ | cosine_precision@1 | 0.6285 |
564
+ | cosine_precision@3 | 0.278 |
565
+ | cosine_precision@5 | 0.1789 |
566
+ | cosine_precision@10 | 0.0943 |
567
+ | cosine_recall@1 | 0.6285 |
568
+ | cosine_recall@3 | 0.8339 |
569
+ | cosine_recall@5 | 0.8943 |
570
+ | cosine_recall@10 | 0.9425 |
571
+ | **cosine_ndcg@10** | **0.7907** |
572
+ | cosine_mrr@10 | 0.7414 |
573
+ | cosine_map@100 | 0.744 |
574
+
575
+ #### Information Retrieval
576
+
577
+ * Dataset: `gooaq-256-dev`
578
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
579
+ ```json
580
+ {
581
+ "truncate_dim": 256
582
+ }
583
+ ```
584
+
585
+ | Metric | Value |
586
+ |:--------------------|:----------|
587
+ | cosine_accuracy@1 | 0.6196 |
588
+ | cosine_accuracy@3 | 0.8262 |
589
+ | cosine_accuracy@5 | 0.888 |
590
+ | cosine_accuracy@10 | 0.9375 |
591
+ | cosine_precision@1 | 0.6196 |
592
+ | cosine_precision@3 | 0.2754 |
593
+ | cosine_precision@5 | 0.1776 |
594
+ | cosine_precision@10 | 0.0938 |
595
+ | cosine_recall@1 | 0.6196 |
596
+ | cosine_recall@3 | 0.8262 |
597
+ | cosine_recall@5 | 0.888 |
598
+ | cosine_recall@10 | 0.9375 |
599
+ | **cosine_ndcg@10** | **0.783** |
600
+ | cosine_mrr@10 | 0.7328 |
601
+ | cosine_map@100 | 0.7356 |
602
+
603
+ #### Information Retrieval
604
+
605
+ * Dataset: `gooaq-128-dev`
606
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
607
+ ```json
608
+ {
609
+ "truncate_dim": 128
610
+ }
611
+ ```
612
+
613
+ | Metric | Value |
614
+ |:--------------------|:-----------|
615
+ | cosine_accuracy@1 | 0.597 |
616
+ | cosine_accuracy@3 | 0.8033 |
617
+ | cosine_accuracy@5 | 0.8681 |
618
+ | cosine_accuracy@10 | 0.9247 |
619
+ | cosine_precision@1 | 0.597 |
620
+ | cosine_precision@3 | 0.2678 |
621
+ | cosine_precision@5 | 0.1736 |
622
+ | cosine_precision@10 | 0.0925 |
623
+ | cosine_recall@1 | 0.597 |
624
+ | cosine_recall@3 | 0.8033 |
625
+ | cosine_recall@5 | 0.8681 |
626
+ | cosine_recall@10 | 0.9247 |
627
+ | **cosine_ndcg@10** | **0.7633** |
628
+ | cosine_mrr@10 | 0.7112 |
629
+ | cosine_map@100 | 0.7143 |
630
+
631
+ #### Information Retrieval
632
+
633
+ * Dataset: `gooaq-64-dev`
634
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
635
+ ```json
636
+ {
637
+ "truncate_dim": 64
638
+ }
639
+ ```
640
+
641
+ | Metric | Value |
642
+ |:--------------------|:-----------|
643
+ | cosine_accuracy@1 | 0.5541 |
644
+ | cosine_accuracy@3 | 0.7568 |
645
+ | cosine_accuracy@5 | 0.8287 |
646
+ | cosine_accuracy@10 | 0.896 |
647
+ | cosine_precision@1 | 0.5541 |
648
+ | cosine_precision@3 | 0.2523 |
649
+ | cosine_precision@5 | 0.1657 |
650
+ | cosine_precision@10 | 0.0896 |
651
+ | cosine_recall@1 | 0.5541 |
652
+ | cosine_recall@3 | 0.7568 |
653
+ | cosine_recall@5 | 0.8287 |
654
+ | cosine_recall@10 | 0.896 |
655
+ | **cosine_ndcg@10** | **0.7246** |
656
+ | cosine_mrr@10 | 0.6697 |
657
+ | cosine_map@100 | 0.6736 |
658
+
659
+ #### Information Retrieval
660
+
661
+ * Dataset: `gooaq-32-dev`
662
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
663
+ ```json
664
+ {
665
+ "truncate_dim": 32
666
+ }
667
+ ```
668
+
669
+ | Metric | Value |
670
+ |:--------------------|:-----------|
671
+ | cosine_accuracy@1 | 0.4602 |
672
+ | cosine_accuracy@3 | 0.6631 |
673
+ | cosine_accuracy@5 | 0.7372 |
674
+ | cosine_accuracy@10 | 0.8226 |
675
+ | cosine_precision@1 | 0.4602 |
676
+ | cosine_precision@3 | 0.221 |
677
+ | cosine_precision@5 | 0.1474 |
678
+ | cosine_precision@10 | 0.0823 |
679
+ | cosine_recall@1 | 0.4602 |
680
+ | cosine_recall@3 | 0.6631 |
681
+ | cosine_recall@5 | 0.7372 |
682
+ | cosine_recall@10 | 0.8226 |
683
+ | **cosine_ndcg@10** | **0.6372** |
684
+ | cosine_mrr@10 | 0.5783 |
685
+ | cosine_map@100 | 0.5841 |
686
+
687
+ <!--
688
+ ## Bias, Risks and Limitations
689
+
690
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
691
+ -->
692
+
693
+ <!--
694
+ ### Recommendations
695
+
696
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
697
+ -->
698
+
699
+ ## Training Details
700
+
701
+ ### Training Dataset
702
+
703
+ #### gooaq
704
+
705
+ * Dataset: [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
706
+ * Size: 3,002,496 training samples
707
+ * Columns: <code>question</code> and <code>answer</code>
708
+ * Approximate statistics based on the first 1000 samples:
709
+ | | question | answer |
710
+ |:--------|:-----------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------|
711
+ | type | string | string |
712
+ | details | <ul><li>min: 18 characters</li><li>mean: 43.23 characters</li><li>max: 96 characters</li></ul> | <ul><li>min: 55 characters</li><li>mean: 253.36 characters</li><li>max: 371 characters</li></ul> |
713
+ * Samples:
714
+ | question | answer |
715
+ |:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
716
+ | <code>what is the difference between broilers and layers?</code> | <code>An egg laying poultry is called egger or layer whereas broilers are reared for obtaining meat. So a layer should be able to produce more number of large sized eggs, without growing too much. On the other hand, a broiler should yield more meat and hence should be able to grow well.</code> |
717
+ | <code>what is the difference between chronological order and spatial order?</code> | <code>As a writer, you should always remember that unlike chronological order and the other organizational methods for data, spatial order does not take into account the time. Spatial order is primarily focused on the location. All it does is take into account the location of objects and not the time.</code> |
718
+ | <code>is kamagra same as viagra?</code> | <code>Kamagra is thought to contain the same active ingredient as Viagra, sildenafil citrate. In theory, it should work in much the same way as Viagra, taking about 45 minutes to take effect, and lasting for around 4-6 hours. However, this will vary from person to person.</code> |
719
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
720
+ ```json
721
+ {
722
+ "loss": "MultipleNegativesRankingLoss",
723
+ "matryoshka_dims": [
724
+ 1024,
725
+ 512,
726
+ 256,
727
+ 128,
728
+ 64,
729
+ 32
730
+ ],
731
+ "matryoshka_weights": [
732
+ 1,
733
+ 1,
734
+ 1,
735
+ 1,
736
+ 1,
737
+ 1
738
+ ],
739
+ "n_dims_per_step": -1
740
+ }
741
+ ```
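The configuration above wraps `MultipleNegativesRankingLoss` in `MatryoshkaLoss` with equal weights over six dimensions. Conceptually, the inner loss treats the other answers in a batch as negatives for each question, and the outer loss repeats that computation on truncated embeddings and sums the results. The following pure-Python sketch shows that structure under stated assumptions: cosine similarity scaled by 20 for the inner loss (the card does not list these values), and hypothetical names and toy vectors throughout. It is not the library's code.

```python
import math
from typing import List, Sequence

Vector = List[float]


def cos_sim(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def in_batch_ranking_loss(queries: List[Vector], answers: List[Vector],
                          scale: float = 20.0) -> float:
    """Cross-entropy where answers[i] is the positive for queries[i]
    and every other answer in the batch is a negative."""
    total = 0.0
    for i, q in enumerate(queries):
        logits = [scale * cos_sim(q, a) for a in answers]
        top = max(logits)
        log_z = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_z - logits[i]
    return total / len(queries)


def matryoshka_loss(queries: List[Vector], answers: List[Vector],
                    dims: List[int], weights: List[float]) -> float:
    """Weighted sum of the ranking loss on embeddings truncated to each dim."""
    return sum(
        w * in_batch_ranking_loss([q[:d] for q in queries], [a[:d] for a in answers])
        for d, w in zip(dims, weights)
    )


# Hypothetical batch of two question/answer pairs with 4-dimensional embeddings.
toy_queries = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
toy_answers = [[0.9, 0.1, 0.3, 0.0], [0.1, 0.8, 0.0, 0.4]]
loss = matryoshka_loss(toy_queries, toy_answers, dims=[4, 2], weights=[1.0, 1.0])
print(loss)
```

Summing over truncated views is what pushes the most useful information into the leading dimensions, which is why the 256-dimensional scores in this card stay close to the 1024-dimensional ones.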
742
+
743
+ ### Evaluation Dataset
744
+
745
+ #### gooaq
746
+
747
+ * Dataset: [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
748
+ * Size: 10,000 evaluation samples
749
+ * Columns: <code>question</code> and <code>answer</code>
750
+ * Approximate statistics based on the first 1000 samples:
751
+ | | question | answer |
752
+ |:--------|:-----------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------|
753
+ | type | string | string |
754
+ | details | <ul><li>min: 18 characters</li><li>mean: 43.17 characters</li><li>max: 98 characters</li></ul> | <ul><li>min: 51 characters</li><li>mean: 254.12 characters</li><li>max: 360 characters</li></ul> |
755
+ * Samples:
756
+ | question | answer |
757
+ |:-----------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
758
+ | <code>how do i program my directv remote with my tv?</code> | <code>['Press MENU on your remote.', 'Select Settings & Help > Settings > Remote Control > Program Remote.', 'Choose the device (TV, audio, DVD) you wish to program. ... ', 'Follow the on-screen prompts to complete programming.']</code> |
759
+ | <code>are rodrigues fruit bats nocturnal?</code> | <code>Before its numbers were threatened by habitat destruction, storms, and hunting, some of those groups could number 500 or more members. Sunrise, sunset. Rodrigues fruit bats are most active at dawn, at dusk, and at night.</code> |
760
+ | <code>why does your heart rate increase during exercise bbc bitesize?</code> | <code>During exercise there is an increase in physical activity and muscle cells respire more than they do when the body is at rest. The heart rate increases during exercise. The rate and depth of breathing increases - this makes sure that more oxygen is absorbed into the blood, and more carbon dioxide is removed from it.</code> |
761
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
+ ```json
+ {
+     "loss": "MultipleNegativesRankingLoss",
+     "matryoshka_dims": [
+         1024,
+         512,
+         256,
+         128,
+         64,
+         32
+     ],
+     "matryoshka_weights": [
+         1,
+         1,
+         1,
+         1,
+         1,
+         1
+     ],
+     "n_dims_per_step": -1
+ }
+ ```
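To make the parameters above concrete, here is a minimal pure-Python sketch of how a Matryoshka-style objective can combine per-dimension losses: embeddings are truncated to each of the six `matryoshka_dims`, an in-batch ranking loss is computed at each size, and the results are summed using `matryoshka_weights`. This is an illustration under stated assumptions, not the Sentence Transformers implementation: the cosine scoring, the `scale=20.0` factor, and all function names are assumptions made for this example.

```python
import math

# Values copied from the loss configuration above.
MATRYOSHKA_DIMS = [1024, 512, 256, 128, 64, 32]
MATRYOSHKA_WEIGHTS = [1, 1, 1, 1, 1, 1]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def in_batch_ranking_loss(questions, answers, scale=20.0):
    """Cross-entropy where answers[i] is the positive for questions[i]
    and every other answer in the batch acts as a negative.
    The scale value is an assumption for this sketch."""
    total = 0.0
    for i, question in enumerate(questions):
        logits = [scale * cosine(question, answer) for answer in answers]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_sum_exp = max_logit + math.log(
            sum(math.exp(logit - max_logit) for logit in logits)
        )
        total += log_sum_exp - logits[i]
    return total / len(questions)


def matryoshka_loss(questions, answers,
                    dims=MATRYOSHKA_DIMS, weights=MATRYOSHKA_WEIGHTS):
    """Weighted sum of the ranking loss over truncated embeddings."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        truncated_q = [q[:dim] for q in questions]
        truncated_a = [a[:dim] for a in answers]
        total += weight * in_batch_ranking_loss(truncated_q, truncated_a)
    return total
```

Because every weight is 1, each truncation size contributes equally to the total in this sketch, which is why the small dimensions are trained as directly as the full 1024-dimensional embedding.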
784
+
785
+ ### Training Hyperparameters
786
+ #### Non-Default Hyperparameters
787
+
788
+ - `eval_strategy`: steps
789
+ - `per_device_train_batch_size`: 2048
790
+ - `per_device_eval_batch_size`: 2048
791
+ - `learning_rate`: 0.2
792
+ - `num_train_epochs`: 1
793
+ - `warmup_ratio`: 0.1
794
+ - `bf16`: True
795
+ - `batch_sampler`: no_duplicates
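As a rough illustration of what `learning_rate: 0.2` and `warmup_ratio: 0.1` imply together with the linear scheduler listed under "All Hyperparameters", the sketch below computes the learning rate at a given step: a linear ramp from 0 to the peak during warmup, then a linear decay to 0. The function name and the rounding of the warmup length are assumptions for this example, and the total step count is not stated in this card, so it is passed in as a parameter.

```python
import math


def linear_schedule_lr(step, total_steps, base_lr=0.2, warmup_ratio=0.1):
    """Learning rate at `step` for linear warmup followed by linear decay.

    Assumes the warmup length is ceil(total_steps * warmup_ratio).
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical usage: 1467 total steps is only an estimate, inferred from
# the training log where step 100 corresponds to epoch 0.0682.
if __name__ == "__main__":
    for step in (0, 147, 734, 1467):
        print(step, linear_schedule_lr(step, total_steps=1467))
```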
796
+
797
+ #### All Hyperparameters
798
+ <details><summary>Click to expand</summary>
799
+
800
+ - `overwrite_output_dir`: False
801
+ - `do_predict`: False
802
+ - `eval_strategy`: steps
803
+ - `prediction_loss_only`: True
804
+ - `per_device_train_batch_size`: 2048
805
+ - `per_device_eval_batch_size`: 2048
806
+ - `per_gpu_train_batch_size`: None
807
+ - `per_gpu_eval_batch_size`: None
808
+ - `gradient_accumulation_steps`: 1
809
+ - `eval_accumulation_steps`: None
810
+ - `torch_empty_cache_steps`: None
811
+ - `learning_rate`: 0.2
812
+ - `weight_decay`: 0.0
813
+ - `adam_beta1`: 0.9
814
+ - `adam_beta2`: 0.999
815
+ - `adam_epsilon`: 1e-08
816
+ - `max_grad_norm`: 1.0
817
+ - `num_train_epochs`: 1
818
+ - `max_steps`: -1
819
+ - `lr_scheduler_type`: linear
820
+ - `lr_scheduler_kwargs`: {}
821
+ - `warmup_ratio`: 0.1
822
+ - `warmup_steps`: 0
823
+ - `log_level`: passive
824
+ - `log_level_replica`: warning
825
+ - `log_on_each_node`: True
826
+ - `logging_nan_inf_filter`: True
827
+ - `save_safetensors`: True
828
+ - `save_on_each_node`: False
829
+ - `save_only_model`: False
830
+ - `restore_callback_states_from_checkpoint`: False
831
+ - `no_cuda`: False
832
+ - `use_cpu`: False
833
+ - `use_mps_device`: False
834
+ - `seed`: 42
835
+ - `data_seed`: None
836
+ - `jit_mode_eval`: False
837
+ - `use_ipex`: False
838
+ - `bf16`: True
839
+ - `fp16`: False
840
+ - `fp16_opt_level`: O1
841
+ - `half_precision_backend`: auto
842
+ - `bf16_full_eval`: False
843
+ - `fp16_full_eval`: False
844
+ - `tf32`: None
845
+ - `local_rank`: 0
846
+ - `ddp_backend`: None
847
+ - `tpu_num_cores`: None
848
+ - `tpu_metrics_debug`: False
849
+ - `debug`: []
850
+ - `dataloader_drop_last`: False
851
+ - `dataloader_num_workers`: 0
852
+ - `dataloader_prefetch_factor`: None
853
+ - `past_index`: -1
854
+ - `disable_tqdm`: False
855
+ - `remove_unused_columns`: True
856
+ - `label_names`: None
857
+ - `load_best_model_at_end`: False
858
+ - `ignore_data_skip`: False
859
+ - `fsdp`: []
860
+ - `fsdp_min_num_params`: 0
861
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
862
+ - `fsdp_transformer_layer_cls_to_wrap`: None
863
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
864
+ - `deepspeed`: None
865
+ - `label_smoothing_factor`: 0.0
866
+ - `optim`: adamw_torch
867
+ - `optim_args`: None
868
+ - `adafactor`: False
869
+ - `group_by_length`: False
870
+ - `length_column_name`: length
871
+ - `ddp_find_unused_parameters`: None
872
+ - `ddp_bucket_cap_mb`: None
873
+ - `ddp_broadcast_buffers`: False
874
+ - `dataloader_pin_memory`: True
875
+ - `dataloader_persistent_workers`: False
876
+ - `skip_memory_metrics`: True
877
+ - `use_legacy_prediction_loop`: False
878
+ - `push_to_hub`: False
879
+ - `resume_from_checkpoint`: None
880
+ - `hub_model_id`: None
881
+ - `hub_strategy`: every_save
882
+ - `hub_private_repo`: None
883
+ - `hub_always_push`: False
884
+ - `gradient_checkpointing`: False
885
+ - `gradient_checkpointing_kwargs`: None
886
+ - `include_inputs_for_metrics`: False
887
+ - `include_for_metrics`: []
888
+ - `eval_do_concat_batches`: True
889
+ - `fp16_backend`: auto
890
+ - `push_to_hub_model_id`: None
891
+ - `push_to_hub_organization`: None
892
+ - `mp_parameters`:
893
+ - `auto_find_batch_size`: False
894
+ - `full_determinism`: False
895
+ - `torchdynamo`: None
896
+ - `ray_scope`: last
897
+ - `ddp_timeout`: 1800
898
+ - `torch_compile`: False
899
+ - `torch_compile_backend`: None
900
+ - `torch_compile_mode`: None
901
+ - `dispatch_batches`: None
902
+ - `split_batches`: None
903
+ - `include_tokens_per_second`: False
904
+ - `include_num_input_tokens_seen`: False
905
+ - `neftune_noise_alpha`: None
906
+ - `optim_target_modules`: None
907
+ - `batch_eval_metrics`: False
908
+ - `eval_on_start`: False
909
+ - `use_liger_kernel`: False
910
+ - `eval_use_gather_object`: False
911
+ - `average_tokens_across_devices`: False
912
+ - `prompts`: None
913
+ - `batch_sampler`: no_duplicates
914
+ - `multi_dataset_batch_sampler`: proportional
915
+
916
+ </details>
917
+
918
+ ### Training Logs
919
+ | Epoch | Step | Training Loss | Validation Loss | gooaq-1024-dev_cosine_ndcg@10 | gooaq-512-dev_cosine_ndcg@10 | gooaq-256-dev_cosine_ndcg@10 | gooaq-128-dev_cosine_ndcg@10 | gooaq-64-dev_cosine_ndcg@10 | gooaq-32-dev_cosine_ndcg@10 |
+ |:------:|:----:|:-------------:|:---------------:|:-----------------------------:|:----------------------------:|:----------------------------:|:----------------------------:|:---------------------------:|:---------------------------:|
+ | -1 | -1 | - | - | 0.2340 | 0.2217 | 0.1954 | 0.1493 | 0.0863 | 0.0339 |
+ | 0.0007 | 1 | 35.6378 | - | - | - | - | - | - | - |
+ | 0.0682 | 100 | 16.3559 | - | - | - | - | - | - | - |
+ | 0.1363 | 200 | 6.0576 | - | - | - | - | - | - | - |
+ | 0.1704 | 250 | - | 1.6966 | 0.7315 | 0.7266 | 0.7170 | 0.6895 | 0.6443 | 0.5363 |
+ | 0.2045 | 300 | 4.9232 | - | - | - | - | - | - | - |
+ | 0.2727 | 400 | 4.4397 | - | - | - | - | - | - | - |
+ | 0.3408 | 500 | 4.1373 | 1.4008 | 0.7613 | 0.7561 | 0.7459 | 0.7253 | 0.6838 | 0.5866 |
+ | 0.4090 | 600 | 3.8967 | - | - | - | - | - | - | - |
+ | 0.4772 | 700 | 3.732 | - | - | - | - | - | - | - |
+ | 0.5112 | 750 | - | 1.2860 | 0.7749 | 0.7708 | 0.7630 | 0.7413 | 0.7017 | 0.6096 |
+ | 0.5453 | 800 | 3.6054 | - | - | - | - | - | - | - |
+ | 0.6135 | 900 | 3.4792 | - | - | - | - | - | - | - |
+ | 0.6817 | 1000 | 3.4143 | 1.1877 | 0.7847 | 0.7806 | 0.7729 | 0.7524 | 0.7119 | 0.6212 |
+ | 0.7498 | 1100 | 3.3194 | - | - | - | - | - | - | - |
+ | 0.8180 | 1200 | 3.2469 | - | - | - | - | - | - | - |
+ | 0.8521 | 1250 | - | 1.1253 | 0.7928 | 0.7888 | 0.7805 | 0.7612 | 0.7221 | 0.6337 |
+ | 0.8862 | 1300 | 3.2015 | - | - | - | - | - | - | - |
+ | 0.9543 | 1400 | 3.1689 | - | - | - | - | - | - | - |
+ | -1 | -1 | - | - | 0.7949 | 0.7907 | 0.7830 | 0.7633 | 0.7246 | 0.6372 |
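One way to read the final row of the table above is to ask how much of the full 1024-dimensional NDCG@10 each truncated size keeps. The short sketch below computes that ratio from the logged values. The variable names are chosen for this example, and the ratios are derived here rather than reported by the training run.

```python
# Final-row NDCG@10 values copied from the training log above,
# keyed by embedding dimension.
final_ndcg = {
    1024: 0.7949,
    512: 0.7907,
    256: 0.7830,
    128: 0.7633,
    64: 0.7246,
    32: 0.6372,
}

full_score = final_ndcg[1024]

# Fraction of the full-size score retained at each truncation.
retained = {dim: score / full_score for dim, score in final_ndcg.items()}

for dim, fraction in retained.items():
    print(f"{dim:>4} dims: NDCG@10 = {final_ndcg[dim]:.4f} "
          f"({fraction:.1%} of the 1024-dim score)")
```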
941
+
942
+
943
+ ### Environmental Impact
944
+ Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codecarbon).
945
+ - **Energy Consumed**: 0.019 kWh
946
+ - **Carbon Emitted**: 0.007 kg of CO2
947
+ - **Hours Used**: 0.124 hours
948
+
949
+ ### Training Hardware
950
+ - **On Cloud**: No
951
+ - **GPU Model**: 1 x NVIDIA GeForce RTX 3090
952
+ - **CPU Model**: 13th Gen Intel(R) Core(TM) i7-13700K
953
+ - **RAM Size**: 31.78 GB
954
+
955
+ ### Framework Versions
956
+ - Python: 3.11.6
957
+ - Sentence Transformers: 4.2.0.dev0
958
+ - Transformers: 4.49.0
959
+ - PyTorch: 2.6.0+cu124
960
+ - Accelerate: 1.5.1
961
+ - Datasets: 2.21.0
962
+ - Tokenizers: 0.21.1
963
+
964
+ ## Citation
965
+
966
+ ### BibTeX
967
+
968
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MatryoshkaLoss
+ ```bibtex
+ @misc{kusupati2024matryoshka,
+     title={Matryoshka Representation Learning},
+     author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
+     year={2024},
+     eprint={2205.13147},
+     archivePrefix={arXiv},
+     primaryClass={cs.LG}
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
1004
+
1005
+ <!--
1006
+ ## Glossary
1007
+
1008
+ *Clearly define terms in order to be accessible across audiences.*
1009
+ -->
1010
+
1011
+ <!--
1012
+ ## Model Card Authors
1013
+
1014
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
1015
+ -->
1016
+
1017
+ <!--
1018
+ ## Model Card Contact
1019
+
1020
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
1021
+ -->
config_sentence_transformers.json ADDED
@@ -0,0 +1,11 @@
+ {
+   "model_type": "SentenceTransformer",
+   "__version__": {
+     "sentence_transformers": "4.2.0.dev0",
+     "transformers": "4.49.0",
+     "pytorch": "2.6.0+cu124"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:07d6d8645bb4b78e55fa99f979b6f43cc2537fb15235b3b4955d81222456f971
+ size 131068000
modules.json ADDED
@@ -0,0 +1,8 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.StaticEmbedding"
+   }
+ ]
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff