---
language:
- en
license: apache-2.0
tags:
- cross-encoder
- sentence-transformers
- text-classification
- sentence-pair-classification
- semantic-similarity
- semantic-search
- retrieval
- reranking
- generated_from_trainer
- dataset_size:1047690
- loss:BinaryCrossEntropyLoss
base_model: Alibaba-NLP/gte-reranker-modernbert-base
datasets:
- aditeyabaral-redis/langcache-sentencepairs
pipeline_tag: text-ranking
library_name: sentence-transformers
metrics:
- accuracy
- accuracy_threshold
- f1
- f1_threshold
- precision
- recall
- average_precision
model-index:
- name: Redis fine-tuned CrossEncoder model for semantic caching on LangCache
  results:
  - task:
      type: cross-encoder-classification
      name: Cross Encoder Classification
    dataset:
      name: val
      type: val
    metrics:
    - type: accuracy
      value: 0.77180249851279
      name: Accuracy
    - type: accuracy_threshold
      value: 0.8926752805709839
      name: Accuracy Threshold
    - type: f1
      value: 0.6933947772657449
      name: F1
    - type: f1_threshold
      value: 0.8759380578994751
      name: F1 Threshold
    - type: precision
      value: 0.678796992481203
      name: Precision
    - type: recall
      value: 0.7086342229199372
      name: Recall
    - type: average_precision
      value: 0.7676424589681807
      name: Average Precision
  - task:
      type: cross-encoder-classification
      name: Cross Encoder Classification
    dataset:
      name: test
      type: test
    metrics:
    - type: accuracy
      value: 0.8947292046242402
      name: Accuracy
    - type: accuracy_threshold
      value: 0.8615613579750061
      name: Accuracy Threshold
    - type: f1
      value: 0.8797439414723366
      name: F1
    - type: f1_threshold
      value: 0.503699541091919
      name: F1 Threshold
    - type: precision
      value: 0.8643306379155435
      name: Precision
    - type: recall
      value: 0.8957169459962756
      name: Recall
    - type: average_precision
      value: 0.934515467879065
      name: Average Precision
---
# Redis fine-tuned CrossEncoder model for semantic caching on LangCache
This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model fine-tuned from [Alibaba-NLP/gte-reranker-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-reranker-modernbert-base) on the [LangCache Sentence Pairs (all)](https://huggingface.co/datasets/aditeyabaral-redis/langcache-sentencepairs) dataset using the [sentence-transformers](https://www.SBERT.net) library. It computes a score for each pair of texts, which can be used for sentence pair classification.
## Model Details
### Model Description
- **Model Type:** Cross Encoder
- **Base model:** [Alibaba-NLP/gte-reranker-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-reranker-modernbert-base) <!-- at revision f7481e6055501a30fb19d090657df9ec1f79ab2c -->
- **Maximum Sequence Length:** 8192 tokens
- **Number of Output Labels:** 1 label
- **Training Dataset:**
    - [LangCache Sentence Pairs (all)](https://huggingface.co/datasets/aditeyabaral-redis/langcache-sentencepairs)
- **Language:** en
- **License:** apache-2.0
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import CrossEncoder
# Download from the 🤗 Hub
model = CrossEncoder("aditeyabaral-redis/langcache-reranker-v1")
# Get scores for pairs of texts
pairs = [
    ['The newer Punts are still very much in existence today and race in the same fleets as the older boats .', 'The newer punts are still very much in existence today and run in the same fleets as the older boats .'],
    ['Turner Valley , was at the Turner Valley Bar N Ranch Airport , southwest of the Turner Valley Bar N Ranch , Alberta , Canada .', 'Turner Valley Bar N Ranch Airport , , was located at Turner Valley Bar N Ranch , southwest of Turner Valley , Alberta , Canada .'],
    ['After losing his second election , he resigned as opposition leader and was replaced by Geoff Pearsall .', 'Max Bingham resigned as opposition leader after losing his second election , and was replaced by Geoff Pearsall .'],
    ['She married Peter Haygarth on 29 May 1964 in Durban . Her second marriage , to Robin Osborne , took place in 1977 .', 'She married Robin Osborne on May 29 , 1964 in Durban , and her second marriage with Peter Haygarth took place in 1977 .'],
    ['In 2005 she moved to Norway , settled in Geilo and worked as a rafting guide , in 2006 she started mountain biking - races .', 'In 2005 , she moved to Geilo , settling in Norway and worked as a rafting guide . She started mountain bike races in 2006 .'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)
# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'The newer Punts are still very much in existence today and race in the same fleets as the older boats .',
    [
        'The newer punts are still very much in existence today and run in the same fleets as the older boats .',
        'Turner Valley Bar N Ranch Airport , , was located at Turner Valley Bar N Ranch , southwest of Turner Valley , Alberta , Canada .',
        'Max Bingham resigned as opposition leader after losing his second election , and was replaced by Geoff Pearsall .',
        'She married Robin Osborne on May 29 , 1964 in Durban , and her second marriage with Peter Haygarth took place in 1977 .',
        'In 2005 , she moved to Geilo , settling in Norway and worked as a rafting guide . She started mountain bike races in 2006 .',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
```
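For a semantic-caching decision, the scores returned by `model.predict` are typically compared against a cutoff. The sketch below is only an illustration and not part of the library's API: the helper `is_cache_hit` and the example scores are hypothetical, and the default threshold is simply the `test` F1 threshold reported in the Evaluation section. A suitable threshold should be tuned on your own data.

```python
from typing import Sequence

# Assumed cutoff: the `test` f1_threshold reported in this card (rounded).
DEFAULT_THRESHOLD = 0.5037


def is_cache_hit(scores: Sequence[float], threshold: float = DEFAULT_THRESHOLD) -> list[bool]:
    """Return True for every pair whose score reaches the threshold."""
    return [float(score) >= threshold for score in scores]


# Hypothetical scores standing in for the output of `model.predict(pairs)`.
example_scores = [0.97, 0.12, 0.51, 0.49]
decisions = is_cache_hit(example_scores)
print(decisions)
```

A higher threshold trades recall for precision, which may be preferable when serving a wrong cached answer is costly.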
## Evaluation
### Metrics
#### Cross Encoder Classification
* Datasets: `val` and `test`
* Evaluated with [<code>CrossEncoderClassificationEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderClassificationEvaluator)
| Metric | val | test |
|:----------------------|:-----------|:-----------|
| accuracy | 0.7718 | 0.8947 |
| accuracy_threshold | 0.8927 | 0.8616 |
| f1 | 0.6934 | 0.8797 |
| f1_threshold | 0.8759 | 0.5037 |
| precision | 0.6788 | 0.8643 |
| recall | 0.7086 | 0.8957 |
| **average_precision** | **0.7676** | **0.9345** |
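
As a quick sanity check on the table above, F1 is the harmonic mean of precision and recall, so the reported F1 values can be recomputed from the reported precision and recall. The helper name below is hypothetical, and the inputs are the rounded values from the table, so agreement is only expected to within rounding.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values copied from the metrics table.
reported = {
    "val": {"precision": 0.6788, "recall": 0.7086, "f1": 0.6934},
    "test": {"precision": 0.8643, "recall": 0.8957, "f1": 0.8797},
}

computed = {
    split: f1_from_precision_recall(m["precision"], m["recall"])
    for split, m in reported.items()
}
for split, value in computed.items():
    print(f"{split}: computed F1 = {value:.4f}, reported F1 = {reported[split]['f1']:.4f}")
```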
## Training Details
### Training Dataset
#### LangCache Sentence Pairs (all)
* Dataset: [LangCache Sentence Pairs (all)](https://huggingface.co/datasets/aditeyabaral-redis/langcache-sentencepairs)
* Size: 62,021 training samples
* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
* Approximate statistics based on the first 1000 samples:
| | sentence1 | sentence2 | label |
|:--------|:-------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------|:------------------------------------------------|
| type | string | string | int |
| details | <ul><li>min: 27 characters</li><li>mean: 112.72 characters</li><li>max: 197 characters</li></ul> | <ul><li>min: 27 characters</li><li>mean: 112.54 characters</li><li>max: 198 characters</li></ul> | <ul><li>0: ~50.30%</li><li>1: ~49.70%</li></ul> |
* Samples:
| sentence1 | sentence2 | label |
|:--------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
| <code>The newer Punts are still very much in existence today and race in the same fleets as the older boats .</code> | <code>The newer punts are still very much in existence today and run in the same fleets as the older boats .</code> | <code>1</code> |
| <code>Turner Valley , was at the Turner Valley Bar N Ranch Airport , southwest of the Turner Valley Bar N Ranch , Alberta , Canada .</code> | <code>Turner Valley Bar N Ranch Airport , , was located at Turner Valley Bar N Ranch , southwest of Turner Valley , Alberta , Canada .</code> | <code>0</code> |
| <code>After losing his second election , he resigned as opposition leader and was replaced by Geoff Pearsall .</code> | <code>Max Bingham resigned as opposition leader after losing his second election , and was replaced by Geoff Pearsall .</code> | <code>1</code> |
* Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
```json
{
"activation_fn": "torch.nn.modules.linear.Identity",
"pos_weight": null
}
```
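With an identity activation and no `pos_weight`, this configuration amounts to plain binary cross-entropy applied to the model's raw logit. The following is a minimal pure-Python sketch of that computation, assuming mean reduction over the batch; `bce_with_logits` is a hypothetical helper written for illustration, not the library's implementation.

```python
import math
from typing import Sequence


def bce_with_logits(logits: Sequence[float], labels: Sequence[int]) -> float:
    """Mean binary cross-entropy computed directly from logits.

    Uses the numerically stable form max(x, 0) - x*y + log(1 + exp(-|x|)),
    which equals -[y*log(sigmoid(x)) + (1 - y)*log(1 - sigmoid(x))].
    """
    if len(logits) != len(labels) or len(logits) == 0:
        raise ValueError("logits and labels must be non-empty and of equal length")
    total = 0.0
    for x, y in zip(logits, labels):
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)


# A logit of 0 means probability 0.5, so the loss is log(2) for either label.
uncertain_loss = bce_with_logits([0.0], [1])
# A confident, correct prediction gives a loss close to zero.
confident_loss = bce_with_logits([10.0], [1])
print(uncertain_loss, confident_loss)
```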
### Evaluation Dataset
#### LangCache Sentence Pairs (all)
* Dataset: [LangCache Sentence Pairs (all)](https://huggingface.co/datasets/aditeyabaral-redis/langcache-sentencepairs)
* Size: 62,021 evaluation samples
* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
* Approximate statistics based on the first 1000 samples:
| | sentence1 | sentence2 | label |
|:--------|:-------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------|:------------------------------------------------|
| type | string | string | int |
| details | <ul><li>min: 27 characters</li><li>mean: 112.72 characters</li><li>max: 197 characters</li></ul> | <ul><li>min: 27 characters</li><li>mean: 112.54 characters</li><li>max: 198 characters</li></ul> | <ul><li>0: ~50.30%</li><li>1: ~49.70%</li></ul> |
* Samples:
| sentence1 | sentence2 | label |
|:--------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
| <code>The newer Punts are still very much in existence today and race in the same fleets as the older boats .</code> | <code>The newer punts are still very much in existence today and run in the same fleets as the older boats .</code> | <code>1</code> |
| <code>Turner Valley , was at the Turner Valley Bar N Ranch Airport , southwest of the Turner Valley Bar N Ranch , Alberta , Canada .</code> | <code>Turner Valley Bar N Ranch Airport , , was located at Turner Valley Bar N Ranch , southwest of Turner Valley , Alberta , Canada .</code> | <code>0</code> |
| <code>After losing his second election , he resigned as opposition leader and was replaced by Geoff Pearsall .</code> | <code>Max Bingham resigned as opposition leader after losing his second election , and was replaced by Geoff Pearsall .</code> | <code>1</code> |
* Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
```json
{
"activation_fn": "torch.nn.modules.linear.Identity",
"pos_weight": null
}
```
### Training Hyperparameters
#### Non-Default Hyperparameters
- `eval_strategy`: steps
- `per_device_train_batch_size`: 48
- `per_device_eval_batch_size`: 48
- `learning_rate`: 0.0002
- `num_train_epochs`: 50
- `warmup_steps`: 1000
- `load_best_model_at_end`: True
- `optim`: adamw_torch
- `push_to_hub`: True
- `hub_model_id`: aditeyabaral-redis/langcache-reranker-v1
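
To make the interaction between `learning_rate`, `warmup_steps`, and the `linear` scheduler listed under All Hyperparameters concrete, the sketch below computes the learning rate at a given step: a linear ramp from 0 to the peak during warmup, then a linear decay to 0. The function name is hypothetical, and `total_steps=272_800` is an assumption estimated from the training logs (about 5,456 steps per epoch over 50 epochs), not a value stated in this card.

```python
def linear_warmup_lr(
    step: int,
    base_lr: float = 2e-4,        # `learning_rate` from this card
    warmup_steps: int = 1000,     # `warmup_steps` from this card
    total_steps: int = 272_800,   # assumption: estimated from the training logs
) -> float:
    """Learning rate for a linear warmup followed by a linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for s in (0, 500, 1000, 136_900, 272_800):
    print(s, linear_warmup_lr(s))
```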
#### All Hyperparameters
<details><summary>Click to expand</summary>
- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 48
- `per_device_eval_batch_size`: 48
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 0.0002
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 50
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 1000
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: True
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: True
- `resume_from_checkpoint`: None
- `hub_model_id`: aditeyabaral-redis/langcache-reranker-v1
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: True
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional
- `router_mapping`: {}
- `learning_rate_mapping`: {}
</details>
### Training Logs
<details><summary>Click to expand</summary>
| Epoch | Step | Training Loss | Validation Loss | val_average_precision | test_average_precision |
|:----------:|:----------:|:-------------:|:---------------:|:---------------------:|:----------------------:|
| -1 | -1 | - | - | 0.7676 | 0.6907 |
| 0.1833 | 1000 | 0.2986 | 0.3912 | - | 0.8585 |
| 0.3666 | 2000 | 0.2465 | 0.3856 | - | 0.8956 |
| 0.5499 | 3000 | 0.2287 | 0.3362 | - | 0.9160 |
| 0.7331 | 4000 | 0.2171 | 0.3408 | - | 0.9071 |
| 0.9164 | 5000 | 0.2068 | 0.3182 | - | 0.9220 |
| 1.0997 | 6000 | 0.1991 | 0.3458 | - | 0.8686 |
| 1.2830 | 7000 | 0.1939 | 0.3188 | - | 0.9244 |
| 1.4663 | 8000 | 0.1917 | 0.3120 | - | 0.9287 |
| 1.6496 | 9000 | 0.1906 | 0.3015 | - | 0.9279 |
| 1.8328 | 10000 | 0.1884 | 0.2986 | - | 0.9316 |
| 2.0161 | 11000 | 0.183 | 0.3065 | - | 0.9320 |
| 2.1994 | 12000 | 0.1714 | 0.3046 | - | 0.9180 |
| 2.3827 | 13000 | 0.1738 | 0.2994 | - | 0.9315 |
| 2.5660 | 14000 | 0.1709 | 0.2965 | - | 0.9347 |
| 2.7493 | 15000 | 0.1717 | 0.2911 | - | 0.9309 |
| 2.9326 | 16000 | 0.1698 | 0.2900 | - | 0.9354 |
| 3.1158 | 17000 | 0.16 | 0.2894 | - | 0.9377 |
| 3.2991 | 18000 | 0.1589 | 0.2830 | - | 0.9356 |
| 3.4824 | 19000 | 0.1574 | 0.2829 | - | 0.9337 |
| 3.6657 | 20000 | 0.1572 | 0.2818 | - | 0.9324 |
| 3.8490 | 21000 | 0.1587 | 0.2866 | - | 0.9365 |
| 4.0323 | 22000 | 0.1543 | 0.2923 | - | 0.9389 |
| 4.2155 | 23000 | 0.1445 | 0.2871 | - | 0.9430 |
| 4.3988 | 24000 | 0.1447 | 0.2793 | - | 0.9429 |
| 4.5821 | 25000 | 0.1473 | 0.2791 | - | 0.9386 |
| 4.7654 | 26000 | 0.146 | 0.2700 | - | 0.9417 |
| 4.9487 | 27000 | 0.1473 | 0.2697 | - | 0.9419 |
| 5.1320 | 28000 | 0.1365 | 0.2810 | - | 0.9411 |
| 5.3152 | 29000 | 0.1331 | 0.2764 | - | 0.9397 |
| 5.4985 | 30000 | 0.1372 | 0.2794 | - | 0.9416 |
| 5.6818 | 31000 | 0.1365 | 0.2751 | - | 0.9408 |
| 5.8651 | 32000 | 0.1365 | 0.2724 | - | 0.9411 |
| 6.0484 | 33000 | 0.1348 | 0.2767 | - | 0.9378 |
| 6.2317 | 34000 | 0.1236 | 0.2840 | - | 0.9388 |
| 6.4150 | 35000 | 0.1262 | 0.2845 | - | 0.9437 |
| 6.5982 | 36000 | 0.1277 | 0.2781 | - | 0.9446 |
| 6.7815 | 37000 | 0.129 | 0.2705 | - | 0.9427 |
| 6.9648 | 38000 | 0.1279 | 0.2773 | - | 0.9381 |
| 7.1481 | 39000 | 0.1173 | 0.2875 | - | 0.9420 |
| 7.3314 | 40000 | 0.1175 | 0.2901 | - | 0.9438 |
| 7.5147 | 41000 | 0.1174 | 0.2787 | - | 0.9420 |
| 7.6979 | 42000 | 0.118 | 0.2879 | - | 0.9424 |
| 7.8812 | 43000 | 0.1201 | 0.2826 | - | 0.9450 |
| 8.0645 | 44000 | 0.1168 | 0.2851 | - | 0.9419 |
| 8.2478 | 45000 | 0.1062 | 0.2913 | - | 0.9450 |
| 8.4311 | 46000 | 0.1091 | 0.2918 | - | 0.9454 |
| 8.6144 | 47000 | 0.1117 | 0.2799 | - | 0.9445 |
| 8.7977 | 48000 | 0.1123 | 0.2762 | - | 0.9443 |
| 8.9809 | 49000 | 0.1132 | 0.2772 | - | 0.9455 |
| 9.1642 | 50000 | 0.1016 | 0.2943 | - | 0.9433 |
| 9.3475 | 51000 | 0.1012 | 0.2879 | - | 0.9441 |
| 9.5308 | 52000 | 0.1029 | 0.2851 | - | 0.9442 |
| 9.7141 | 53000 | 0.105 | 0.2905 | - | 0.9448 |
| 9.8974 | 54000 | 0.1062 | 0.2960 | - | 0.9425 |
| 10.0806 | 55000 | 0.0996 | 0.2984 | - | 0.9430 |
| 10.2639 | 56000 | 0.0924 | 0.2947 | - | 0.9432 |
| 10.4472 | 57000 | 0.0939 | 0.2918 | - | 0.9421 |
| 10.6305 | 58000 | 0.0977 | 0.2895 | - | 0.9438 |
| 10.8138 | 59000 | 0.0977 | 0.2905 | - | 0.9446 |
| 10.9971 | 60000 | 0.0985 | 0.2882 | - | 0.9403 |
| 11.1804 | 61000 | 0.0857 | 0.3025 | - | 0.9435 |
| 11.3636 | 62000 | 0.0869 | 0.2997 | - | 0.9450 |
| 11.5469 | 63000 | 0.0886 | 0.3025 | - | 0.9459 |
| 11.7302 | 64000 | 0.0901 | 0.3000 | - | 0.9443 |
| 11.9135 | 65000 | 0.092 | 0.2913 | - | 0.9424 |
| 12.0968 | 66000 | 0.085 | 0.3017 | - | 0.9443 |
| 12.2801 | 67000 | 0.0801 | 0.3101 | - | 0.9449 |
| 12.4633 | 68000 | 0.0823 | 0.3018 | - | 0.9468 |
| 12.6466 | 69000 | 0.0841 | 0.2971 | - | 0.9457 |
| 12.8299 | 70000 | 0.0855 | 0.3063 | - | 0.9428 |
| 13.0132 | 71000 | 0.0854 | 0.3105 | - | 0.9436 |
| 13.1965 | 72000 | 0.0744 | 0.3017 | - | 0.9451 |
| 13.3798 | 73000 | 0.0763 | 0.3024 | - | 0.9425 |
| 13.5630 | 74000 | 0.0777 | 0.2948 | - | 0.9461 |
| 13.7463 | 75000 | 0.0791 | 0.3006 | - | 0.9466 |
| 13.9296 | 76000 | 0.0803 | 0.3001 | - | 0.9446 |
| 14.1129 | 77000 | 0.0721 | 0.3229 | - | 0.9445 |
| 14.2962 | 78000 | 0.0692 | 0.3231 | - | 0.9437 |
| 14.4795 | 79000 | 0.0703 | 0.3242 | - | 0.9458 |
| 14.6628 | 80000 | 0.073 | 0.3078 | - | 0.9469 |
| 14.8460 | 81000 | 0.073 | 0.3111 | - | 0.9448 |
| 15.0293 | 82000 | 0.0731 | 0.3319 | - | 0.9459 |
| 15.2126 | 83000 | 0.0629 | 0.3094 | - | 0.9464 |
| 15.3959 | 84000 | 0.0644 | 0.3440 | - | 0.9427 |
| 15.5792 | 85000 | 0.0673 | 0.3234 | - | 0.9457 |
| 15.7625 | 86000 | 0.068 | 0.3192 | - | 0.9430 |
| 15.9457 | 87000 | 0.0687 | 0.3097 | - | 0.9428 |
| 16.1290 | 88000 | 0.0618 | 0.3379 | - | 0.9466 |
| 16.3123 | 89000 | 0.0615 | 0.3192 | - | 0.9436 |
| 16.4956 | 90000 | 0.0605 | 0.3303 | - | 0.9452 |
| 16.6789 | 91000 | 0.0635 | 0.3154 | - | 0.9445 |
| 16.8622 | 92000 | 0.0637 | 0.3324 | - | 0.9467 |
| 17.0455 | 93000 | 0.0615 | 0.3365 | - | 0.9424 |
| 17.2287 | 94000 | 0.056 | 0.3332 | - | 0.9446 |
| 17.4120 | 95000 | 0.0567 | 0.3412 | - | 0.9432 |
| 17.5953 | 96000 | 0.0571 | 0.3419 | - | 0.9444 |
| 17.7786 | 97000 | 0.0589 | 0.3271 | - | 0.9403 |
| 17.9619 | 98000 | 0.0588 | 0.3281 | - | 0.9440 |
| 18.1452 | 99000 | 0.053 | 0.3282 | - | 0.9475 |
| 18.3284 | 100000 | 0.0525 | 0.3414 | - | 0.9470 |
| 18.5117 | 101000 | 0.0528 | 0.3263 | - | 0.9450 |
| 18.6950 | 102000 | 0.0539 | 0.3363 | - | 0.9428 |
| 18.8783 | 103000 | 0.056 | 0.3487 | - | 0.9454 |
| 19.0616 | 104000 | 0.0528 | 0.3701 | - | 0.9465 |
| 19.2449 | 105000 | 0.0464 | 0.3877 | - | 0.9328 |
| 19.4282 | 106000 | 0.0499 | 0.3379 | - | 0.9451 |
| 19.6114 | 107000 | 0.0496 | 0.3500 | - | 0.9442 |
| 19.7947 | 108000 | 0.0502 | 0.3420 | - | 0.9444 |
| 19.9780 | 109000 | 0.0519 | 0.3459 | - | 0.9442 |
| 20.1613 | 110000 | 0.0443 | 0.3755 | - | 0.9449 |
| 20.3446 | 111000 | 0.0449 | 0.3588 | - | 0.9447 |
| 20.5279 | 112000 | 0.0448 | 0.3616 | - | 0.9448 |
| 20.7111 | 113000 | 0.0471 | 0.3463 | - | 0.9426 |
| 20.8944 | 114000 | 0.0474 | 0.3784 | - | 0.9400 |
| 21.0777 | 115000 | 0.0451 | 0.3493 | - | 0.9447 |
| 21.2610 | 116000 | 0.0415 | 0.3633 | - | 0.9448 |
| 21.4443 | 117000 | 0.0412 | 0.3635 | - | 0.9472 |
| 21.6276 | 118000 | 0.0441 | 0.3710 | - | 0.9454 |
| 21.8109 | 119000 | 0.0427 | 0.3696 | - | 0.9459 |
| 21.9941 | 120000 | 0.045 | 0.3571 | - | 0.9440 |
| 22.1774 | 121000 | 0.0384 | 0.3815 | - | 0.9431 |
| 22.3607 | 122000 | 0.0389 | 0.3832 | - | 0.9428 |
| 22.5440 | 123000 | 0.0397 | 0.3773 | - | 0.9461 |
| 22.7273 | 124000 | 0.0402 | 0.3977 | - | 0.9415 |
| 22.9106 | 125000 | 0.0399 | 0.3870 | - | 0.9354 |
| 23.0938 | 126000 | 0.0376 | 0.3820 | - | 0.9409 |
| 23.2771 | 127000 | 0.0362 | 0.3755 | - | 0.9411 |
| 23.4604 | 128000 | 0.0358 | 0.3915 | - | 0.9461 |
| 23.6437 | 129000 | 0.0368 | 0.3688 | - | 0.9411 |
| 23.8270 | 130000 | 0.0374 | 0.4068 | - | 0.9427 |
| 24.0103 | 131000 | 0.0376 | 0.4155 | - | 0.9445 |
| 24.1935 | 132000 | 0.0325 | 0.3967 | - | 0.9434 |
| 24.3768 | 133000 | 0.0333 | 0.4209 | - | 0.9425 |
| 24.5601 | 134000 | 0.0335 | 0.4018 | - | 0.9432 |
| 24.7434 | 135000 | 0.0343 | 0.4250 | - | 0.9443 |
| 24.9267 | 136000 | 0.0345 | 0.4185 | - | 0.9414 |
| 25.1100 | 137000 | 0.0316 | 0.4075 | - | 0.9454 |
| 25.2933 | 138000 | 0.0299 | 0.4096 | - | 0.9454 |
| 25.4765 | 139000 | 0.0294 | 0.4135 | - | 0.9459 |
| 25.6598 | 140000 | 0.0317 | 0.3997 | - | 0.9445 |
| 25.8431 | 141000 | 0.0328 | 0.4093 | - | 0.9438 |
| 26.0264 | 142000 | 0.0317 | 0.4361 | - | 0.9404 |
| 26.2097 | 143000 | 0.027 | 0.4347 | - | 0.9454 |
| 26.3930 | 144000 | 0.0281 | 0.4149 | - | 0.9413 |
| 26.5762 | 145000 | 0.0283 | 0.4151 | - | 0.9454 |
| 26.7595 | 146000 | 0.0302 | 0.4041 | - | 0.9416 |
| 26.9428 | 147000 | 0.0301 | 0.4265 | - | 0.9340 |
| 27.1261 | 148000 | 0.026 | 0.4223 | - | 0.9426 |
| 27.3094 | 149000 | 0.0267 | 0.4237 | - | 0.9430 |
| 27.4927 | 150000 | 0.0268 | 0.4281 | - | 0.9458 |
| 27.6760 | 151000 | 0.0262 | 0.4193 | - | 0.9426 |
| 27.8592 | 152000 | 0.0262 | 0.4412 | - | 0.9402 |
| 28.0425 | 153000 | 0.0261 | 0.4795 | - | 0.9425 |
| 28.2258 | 154000 | 0.024 | 0.4519 | - | 0.9442 |
| 28.4091 | 155000 | 0.024 | 0.4395 | - | 0.9440 |
| 28.5924 | 156000 | 0.025 | 0.4549 | - | 0.9456 |
| 28.7757 | 157000 | 0.0253 | 0.4446 | - | 0.9429 |
| 28.9589 | 158000 | 0.0258 | 0.4349 | - | 0.9425 |
| 29.1422 | 159000 | 0.0211 | 0.4490 | - | 0.9430 |
| 29.3255 | 160000 | 0.0218 | 0.4538 | - | 0.9455 |
| 29.5088 | 161000 | 0.0217 | 0.4771 | - | 0.9435 |
| 29.6921 | 162000 | 0.0228 | 0.4238 | - | 0.9440 |
| 29.8754 | 163000 | 0.022 | 0.4731 | - | 0.9412 |
| 30.0587 | 164000 | 0.0227 | 0.4630 | - | 0.9450 |
| 30.2419 | 165000 | 0.0197 | 0.4840 | - | 0.9453 |
| 30.4252 | 166000 | 0.0198 | 0.4799 | - | 0.9434 |
| 30.6085 | 167000 | 0.022 | 0.4650 | - | 0.9453 |
| 30.7918 | 168000 | 0.0211 | 0.4592 | - | 0.9465 |
| 30.9751 | 169000 | 0.022 | 0.4727 | - | 0.9405 |
| 31.1584 | 170000 | 0.0184 | 0.4802 | - | 0.9460 |
| 31.3416 | 171000 | 0.0186 | 0.4953 | - | 0.9449 |
| 31.5249 | 172000 | 0.0187 | 0.4516 | - | 0.9424 |
| 31.7082 | 173000 | 0.019 | 0.4803 | - | 0.9444 |
| 31.8915 | 174000 | 0.0186 | 0.4499 | - | 0.9448 |
| 32.0748 | 175000 | 0.0181 | 0.5211 | - | 0.9377 |
| 32.2581 | 176000 | 0.0163 | 0.4941 | - | 0.9434 |
| 32.4413 | 177000 | 0.0168 | 0.4672 | - | 0.9433 |
| 32.6246 | 178000 | 0.0171 | 0.4990 | - | 0.9414 |
| 32.8079 | 179000 | 0.0185 | 0.4537 | - | 0.9444 |
| 32.9912 | 180000 | 0.0179 | 0.4929 | - | 0.9460 |
| 33.1745 | 181000 | 0.0144 | 0.5037 | - | 0.9407 |
| 33.3578 | 182000 | 0.0143 | 0.4986 | - | 0.9449 |
| 33.5411 | 183000 | 0.016 | 0.5043 | - | 0.9452 |
| 33.7243 | 184000 | 0.0152 | 0.5090 | - | 0.9427 |
| 33.9076 | 185000 | 0.0154 | 0.5100 | - | 0.9414 |
| 34.0909 | 186000 | 0.0146 | 0.5367 | - | 0.9386 |
| 34.2742 | 187000 | 0.0138 | 0.5063 | - | 0.9395 |
| 34.4575 | 188000 | 0.0143 | 0.4871 | - | 0.9446 |
| 34.6408 | 189000 | 0.014 | 0.4947 | - | 0.9483 |
| **34.824** | **190000** | **0.0142** | **0.5079** | **-** | **0.9467** |
| 35.0073 | 191000 | 0.014 | 0.5062 | - | 0.9439 |
| 35.1906 | 192000 | 0.0122 | 0.5293 | - | 0.9410 |
| 35.3739 | 193000 | 0.0127 | 0.5351 | - | 0.9401 |
| 35.5572 | 194000 | 0.0132 | 0.5263 | - | 0.9369 |
| 35.7405 | 195000 | 0.0134 | 0.5300 | - | 0.9427 |
| 35.9238 | 196000 | 0.0138 | 0.5230 | - | 0.9416 |
| 36.1070 | 197000 | 0.0129 | 0.5399 | - | 0.9417 |
| 36.2903 | 198000 | 0.0109 | 0.5352 | - | 0.9433 |
| 36.4736 | 199000 | 0.0114 | 0.5587 | - | 0.9404 |
| 36.6569 | 200000 | 0.012 | 0.5289 | - | 0.9441 |
| 36.8402 | 201000 | 0.012 | 0.5516 | - | 0.9434 |
| 37.0235 | 202000 | 0.0121 | 0.5467 | - | 0.9418 |
| 37.2067 | 203000 | 0.0108 | 0.5499 | - | 0.9412 |
| 37.3900 | 204000 | 0.0107 | 0.5459 | - | 0.9427 |
| 37.5733 | 205000 | 0.0105 | 0.5375 | - | 0.9414 |
| 37.7566 | 206000 | 0.0109 | 0.5566 | - | 0.9421 |
| 37.9399 | 207000 | 0.011 | 0.5601 | - | 0.9428 |
| 38.1232 | 208000 | 0.0095 | 0.5700 | - | 0.9406 |
| 38.3065 | 209000 | 0.0098 | 0.5493 | - | 0.9417 |
| 38.4897 | 210000 | 0.0093 | 0.5867 | - | 0.9372 |
| 38.6730 | 211000 | 0.0095 | 0.6087 | - | 0.9394 |
| 38.8563 | 212000 | 0.0096 | 0.5888 | - | 0.9397 |
| 39.0396 | 213000 | 0.0094 | 0.5806 | - | 0.9380 |
| 39.2229 | 214000 | 0.0087 | 0.5927 | - | 0.9393 |
| 39.4062 | 215000 | 0.0079 | 0.6153 | - | 0.9376 |
| 39.5894 | 216000 | 0.009 | 0.6151 | - | 0.9398 |
| 39.7727 | 217000 | 0.009 | 0.5601 | - | 0.9379 |
| 39.9560 | 218000 | 0.0086 | 0.5845 | - | 0.9409 |
| 40.1393 | 219000 | 0.0078 | 0.5929 | - | 0.9396 |
| 40.3226 | 220000 | 0.0077 | 0.6086 | - | 0.9417 |
| 40.5059 | 221000 | 0.0075 | 0.6053 | - | 0.9418 |
| 40.6891 | 222000 | 0.008 | 0.6078 | - | 0.9394 |
| 40.8724 | 223000 | 0.0084 | 0.5975 | - | 0.9423 |
| 41.0557 | 224000 | 0.0068 | 0.6410 | - | 0.9400 |
| 41.2390 | 225000 | 0.0067 | 0.6183 | - | 0.9409 |
| 41.4223 | 226000 | 0.0067 | 0.6239 | - | 0.9401 |
| 41.6056 | 227000 | 0.0075 | 0.5971 | - | 0.9408 |
| 41.7889 | 228000 | 0.0069 | 0.6458 | - | 0.9396 |
| 41.9721 | 229000 | 0.0073 | 0.6289 | - | 0.9337 |
| 42.1554 | 230000 | 0.0061 | 0.6311 | - | 0.9351 |
| 42.3387 | 231000 | 0.0064 | 0.6371 | - | 0.9254 |
| 42.5220 | 232000 | 0.0067 | 0.6119 | - | 0.9238 |
| 42.7053 | 233000 | 0.0068 | 0.6045 | - | 0.9435 |
| 42.8886 | 234000 | 0.0064 | 0.6246 | - | 0.9403 |
| 43.0718 | 235000 | 0.0066 | 0.6077 | - | 0.9355 |
| 43.2551 | 236000 | 0.0054 | 0.6348 | - | 0.9429 |
| 43.4384 | 237000 | 0.0053 | 0.6606 | - | 0.9414 |
| 43.6217 | 238000 | 0.0054 | 0.6373 | - | 0.9421 |
| 43.8050 | 239000 | 0.006 | 0.6122 | - | 0.9391 |
| 43.9883 | 240000 | 0.0058 | 0.6438 | - | 0.9380 |
| 44.1716 | 241000 | 0.0051 | 0.6474 | - | 0.9392 |
| 44.3548 | 242000 | 0.0049 | 0.6637 | - | 0.9399 |
| 44.5381 | 243000 | 0.005 | 0.6765 | - | 0.9420 |
| 44.7214 | 244000 | 0.0052 | 0.6585 | - | 0.9406 |
| 44.9047 | 245000 | 0.005 | 0.6609 | - | 0.9420 |
| 45.0880 | 246000 | 0.0048 | 0.6725 | - | 0.9417 |
| 45.2713 | 247000 | 0.0044 | 0.6597 | - | 0.9411 |
| 45.4545 | 248000 | 0.0045 | 0.6717 | - | 0.9381 |
| 45.6378 | 249000 | 0.0046 | 0.6689 | - | 0.9361 |
| 45.8211 | 250000 | 0.0046 | 0.6703 | - | 0.9334 |
| 46.0044 | 251000 | 0.0044 | 0.6958 | - | 0.9324 |
| 46.1877 | 252000 | 0.0041 | 0.6884 | - | 0.9380 |
| 46.3710 | 253000 | 0.0041 | 0.6958 | - | 0.9342 |
| 46.5543 | 254000 | 0.004 | 0.6796 | - | 0.9375 |
| 46.7375 | 255000 | 0.0042 | 0.6735 | - | 0.9311 |
| 46.9208 | 256000 | 0.004 | 0.7004 | - | 0.9264 |
| 47.1041 | 257000 | 0.0041 | 0.6798 | - | 0.9303 |
| 47.2874 | 258000 | 0.0036 | 0.7039 | - | 0.9330 |
| 47.4707 | 259000 | 0.0037 | 0.7133 | - | 0.9277 |
| 47.6540 | 260000 | 0.0033 | 0.7200 | - | 0.9250 |
| 47.8372 | 261000 | 0.0038 | 0.7204 | - | 0.9292 |
| 48.0205 | 262000 | 0.0034 | 0.7214 | - | 0.9336 |
| 48.2038 | 263000 | 0.0037 | 0.7077 | - | 0.9313 |
| 48.3871 | 264000 | 0.0033 | 0.7218 | - | 0.9289 |
| 48.5704 | 265000 | 0.0033 | 0.7258 | - | 0.9328 |
| 48.7537 | 266000 | 0.0034 | 0.7215 | - | 0.9346 |
| 48.9370 | 267000 | 0.0031 | 0.7300 | - | 0.9347 |
| 49.1202 | 268000 | 0.0033 | 0.7242 | - | 0.9350 |
| 49.3035 | 269000 | 0.0028 | 0.7320 | - | 0.9345 |
| 49.4868 | 270000 | 0.003 | 0.7397 | - | 0.9341 |
| 49.6701 | 271000 | 0.0029 | 0.7410 | - | 0.9342 |
| 49.8534 | 272000 | 0.0029 | 0.7426 | - | 0.9345 |
* The bold row denotes the saved checkpoint.
</details>
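
The logs can also be inspected programmatically. The sketch below uses a handful of rows copied from the table above and, among those rows only, finds the step with the lowest validation loss and the step with the highest `test_average_precision`. The `LogRow` type and variable names are hypothetical.

```python
from typing import NamedTuple


class LogRow(NamedTuple):
    step: int
    training_loss: float
    validation_loss: float
    test_average_precision: float


# A small excerpt of the training log, copied from the table above.
rows = [
    LogRow(1000, 0.2986, 0.3912, 0.8585),
    LogRow(27000, 0.1473, 0.2697, 0.9419),
    LogRow(100000, 0.0525, 0.3414, 0.9470),
    LogRow(190000, 0.0142, 0.5079, 0.9467),
    LogRow(272000, 0.0029, 0.7426, 0.9345),
]

lowest_val_loss = min(rows, key=lambda r: r.validation_loss)
highest_test_ap = max(rows, key=lambda r: r.test_average_precision)

print("lowest validation loss in excerpt:", lowest_val_loss.step)
print("highest test AP in excerpt:", highest_test_ap.step)
```

In this excerpt the training loss keeps falling while the validation loss rises after the early epochs, so the two selection criteria point at different checkpoints.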
### Framework Versions
- Python: 3.12.3
- Sentence Transformers: 5.1.0
- Transformers: 4.55.0
- PyTorch: 2.8.0+cu128
- Accelerate: 1.10.0
- Datasets: 4.0.0
- Tokenizers: 0.21.4
## Citation
### BibTeX
#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
```