metadata
language:
  - en
license: apache-2.0
tags:
  - cross-encoder
  - sentence-transformers
  - text-classification
  - sentence-pair-classification
  - semantic-similarity
  - semantic-search
  - retrieval
  - reranking
  - generated_from_trainer
  - dataset_size:1047690
  - loss:BinaryCrossEntropyLoss
base_model: Alibaba-NLP/gte-reranker-modernbert-base
datasets:
  - aditeyabaral-redis/langcache-sentencepairs
pipeline_tag: text-ranking
library_name: sentence-transformers
metrics:
  - accuracy
  - accuracy_threshold
  - f1
  - f1_threshold
  - precision
  - recall
  - average_precision
model-index:
  - name: Redis fine-tuned CrossEncoder model for semantic caching on LangCache
    results:
      - task:
          type: cross-encoder-classification
          name: Cross Encoder Classification
        dataset:
          name: val
          type: val
        metrics:
          - type: accuracy
            value: 0.77180249851279
            name: Accuracy
          - type: accuracy_threshold
            value: 0.8926752805709839
            name: Accuracy Threshold
          - type: f1
            value: 0.6933947772657449
            name: F1
          - type: f1_threshold
            value: 0.8759380578994751
            name: F1 Threshold
          - type: precision
            value: 0.678796992481203
            name: Precision
          - type: recall
            value: 0.7086342229199372
            name: Recall
          - type: average_precision
            value: 0.7676424589681807
            name: Average Precision
      - task:
          type: cross-encoder-classification
          name: Cross Encoder Classification
        dataset:
          name: test
          type: test
        metrics:
          - type: accuracy
            value: 0.7230292965285952
            name: Accuracy
          - type: accuracy_threshold
            value: 0.9352303147315979
            name: Accuracy Threshold
          - type: f1
            value: 0.7144263194410831
            name: F1
          - type: f1_threshold
            value: 0.9142870903015137
            name: F1 Threshold
          - type: precision
            value: 0.6302559284880577
            name: Precision
          - type: recall
            value: 0.8245437616387337
            name: Recall
          - type: average_precision
            value: 0.6906882331078481
            name: Average Precision

Redis fine-tuned CrossEncoder model for semantic caching on LangCache

This is a Cross Encoder model fine-tuned from Alibaba-NLP/gte-reranker-modernbert-base on the LangCache Sentence Pairs (all) dataset using the sentence-transformers library. It computes a score for each pair of texts, which can be used for sentence pair classification.

Model Details

Model Description

Model Sources

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("aditeyabaral-redis/langcache-reranker-v1")
# Get scores for pairs of texts
pairs = [
    ['The newer Punts are still very much in existence today and race in the same fleets as the older boats .', 'The newer punts are still very much in existence today and run in the same fleets as the older boats .'],
    ['Turner Valley , was at the Turner Valley Bar N Ranch Airport , southwest of the Turner Valley Bar N Ranch , Alberta , Canada .', 'Turner Valley Bar N Ranch Airport , , was located at Turner Valley Bar N Ranch , southwest of Turner Valley , Alberta , Canada .'],
    ['After losing his second election , he resigned as opposition leader and was replaced by Geoff Pearsall .', 'Max Bingham resigned as opposition leader after losing his second election , and was replaced by Geoff Pearsall .'],
    ['She married Peter Haygarth on 29 May 1964 in Durban . Her second marriage , to Robin Osborne , took place in 1977 .', 'She married Robin Osborne on May 29 , 1964 in Durban , and her second marriage with Peter Haygarth took place in 1977 .'],
    ['In 2005 she moved to Norway , settled in Geilo and worked as a rafting guide , in 2006 she started mountain biking - races .', 'In 2005 , she moved to Geilo , settling in Norway and worked as a rafting guide . She started mountain bike races in 2006 .'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'The newer Punts are still very much in existence today and race in the same fleets as the older boats .',
    [
        'The newer punts are still very much in existence today and run in the same fleets as the older boats .',
        'Turner Valley Bar N Ranch Airport , , was located at Turner Valley Bar N Ranch , southwest of Turner Valley , Alberta , Canada .',
        'Max Bingham resigned as opposition leader after losing his second election , and was replaced by Geoff Pearsall .',
        'She married Robin Osborne on May 29 , 1964 in Durban , and her second marriage with Peter Haygarth took place in 1977 .',
        'In 2005 , she moved to Geilo , settling in Norway and worked as a rafting guide . She started mountain bike races in 2006 .',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
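
For semantic caching, the pair scores would typically be compared against a decision threshold. The sketch below is one possible way to do that and is not taken from the model authors: `best_cache_hit` and its parameters are hypothetical names, and the default threshold of 0.8759 is simply the validation `f1_threshold` reported in this card, which may not suit other data. Any callable that maps a list of text pairs to one score per pair can be passed as `score_fn`; `model.predict` has that shape.

```python
from typing import Callable, Optional, Sequence, Tuple


# Hypothetical helper for illustration; not part of sentence-transformers.
def best_cache_hit(
    query: str,
    cached_queries: Sequence[str],
    score_fn: Callable[[list], Sequence[float]],
    threshold: float = 0.8759,  # assumption: val f1_threshold from this card
) -> Optional[Tuple[int, float]]:
    """Return (index, score) of the best-scoring cached query, or None on a miss."""
    if not cached_queries:
        return None
    # Score the incoming query against every cached query.
    pairs = [(query, cached) for cached in cached_queries]
    scores = score_fn(pairs)
    # Pick the highest-scoring candidate and accept it only above the threshold.
    best = max(range(len(pairs)), key=lambda i: scores[i])
    best_score = float(scores[best])
    if best_score < threshold:
        return None
    return best, best_score
```

With the model loaded as above, a call such as `best_cache_hit(new_query, cached_queries, model.predict)` would return the index of the matching cached entry, or `None` when nothing scores high enough. The threshold is worth tuning on your own traffic, since the card reports different optimal thresholds for the val and test splits.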

Evaluation

Metrics

Cross Encoder Classification

Metric              val     test
accuracy            0.7718  0.7230
accuracy_threshold  0.8927  0.9352
f1                  0.6934  0.7144
f1_threshold        0.8759  0.9143
precision           0.6788  0.6303
recall              0.7086  0.8245
average_precision   0.7676  0.6907
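
As a quick sanity check on these numbers, F1 is the harmonic mean of precision and recall, so the reported F1 should be recoverable from the reported precision and recall. The following sketch recomputes it from the rounded values in the table; the function name is hypothetical, and the tolerance only accounts for four-decimal rounding.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values copied from the metrics reported in this card.
reported = {
    "val": {"precision": 0.6788, "recall": 0.7086, "f1": 0.6934},
    "test": {"precision": 0.6303, "recall": 0.8245, "f1": 0.7144},
}

for split, m in reported.items():
    f1 = f1_from_precision_recall(m["precision"], m["recall"])
    # Allow for rounding in the four-decimal inputs.
    assert abs(f1 - m["f1"]) < 5e-4, split
    print(f"{split}: recomputed F1 = {f1:.4f}")
```

The agreement is consistent with precision and recall being measured at the same threshold as F1 (`f1_threshold`) rather than at `accuracy_threshold`.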

Training Details

Training Dataset

LangCache Sentence Pairs (all)

  • Dataset: LangCache Sentence Pairs (all)
  • Size: 62,021 training samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (type: string): min 27 characters, mean 112.72 characters, max 197 characters
    • sentence2 (type: string): min 27 characters, mean 112.54 characters, max 198 characters
    • label (type: int): 0: ~50.30%, 1: ~49.70%
  • Samples:
    • sentence1: The newer Punts are still very much in existence today and race in the same fleets as the older boats .
      sentence2: The newer punts are still very much in existence today and run in the same fleets as the older boats .
      label: 1
    • sentence1: Turner Valley , was at the Turner Valley Bar N Ranch Airport , southwest of the Turner Valley Bar N Ranch , Alberta , Canada .
      sentence2: Turner Valley Bar N Ranch Airport , , was located at Turner Valley Bar N Ranch , southwest of Turner Valley , Alberta , Canada .
      label: 0
    • sentence1: After losing his second election , he resigned as opposition leader and was replaced by Geoff Pearsall .
      sentence2: Max Bingham resigned as opposition leader after losing his second election , and was replaced by Geoff Pearsall .
      label: 1
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": null
    }
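
The parameters above (an identity activation and no `pos_weight`) suggest that raw logits are passed directly to an unweighted binary cross-entropy. The sketch below shows that computation in plain Python, in the numerically stable form usually used for logits. It is an illustration under that assumption and not the library's code; the function names are hypothetical.

```python
import math
from typing import Sequence


def bce_with_logits(logit: float, label: float) -> float:
    """Binary cross-entropy on a raw logit x with target y in {0, 1}.

    Equivalent to -[y * log(s) + (1 - y) * log(1 - s)] with s = sigmoid(x),
    rewritten as max(x, 0) - x * y + log(1 + exp(-|x|)) to avoid overflow.
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def mean_bce_with_logits(logits: Sequence[float], labels: Sequence[float]) -> float:
    """Average the per-example loss over a batch (no positive-class weighting)."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    losses = [bce_with_logits(x, y) for x, y in zip(logits, labels)]
    return sum(losses) / len(losses)


# A confident correct prediction costs little; a confident wrong one costs a lot.
print(bce_with_logits(4.0, 1.0))
print(bce_with_logits(4.0, 0.0))
```

Because the activation is the identity, the sigmoid appears only inside the loss; at inference time a score in the 0 to 1 range is obtained by applying a sigmoid to the logit.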
    

Evaluation Dataset

LangCache Sentence Pairs (all)

  • Dataset: LangCache Sentence Pairs (all)
  • Size: 62,021 evaluation samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (type: string): min 27 characters, mean 112.72 characters, max 197 characters
    • sentence2 (type: string): min 27 characters, mean 112.54 characters, max 198 characters
    • label (type: int): 0: ~50.30%, 1: ~49.70%
  • Samples:
    • sentence1: The newer Punts are still very much in existence today and race in the same fleets as the older boats .
      sentence2: The newer punts are still very much in existence today and run in the same fleets as the older boats .
      label: 1
    • sentence1: Turner Valley , was at the Turner Valley Bar N Ranch Airport , southwest of the Turner Valley Bar N Ranch , Alberta , Canada .
      sentence2: Turner Valley Bar N Ranch Airport , , was located at Turner Valley Bar N Ranch , southwest of Turner Valley , Alberta , Canada .
      label: 0
    • sentence1: After losing his second election , he resigned as opposition leader and was replaced by Geoff Pearsall .
      sentence2: Max Bingham resigned as opposition leader after losing his second election , and was replaced by Geoff Pearsall .
      label: 1
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": null
    }
    

Training Logs

Epoch  Step  val_average_precision  test_average_precision
-1     -1    0.7676                 0.6907

Framework Versions

  • Python: 3.12.3
  • Sentence Transformers: 5.1.0
  • Transformers: 4.55.0
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.10.0
  • Datasets: 4.0.0
  • Tokenizers: 0.21.4

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}