CrossEncoder based on microsoft/MiniLM-L12-H384-uncased
This is a Cross Encoder model finetuned from microsoft/MiniLM-L12-H384-uncased using the sentence-transformers library. It computes scores for pairs of texts, which can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Cross Encoder
- Base model: microsoft/MiniLM-L12-H384-uncased
- Maximum Sequence Length: 512 tokens
- Number of Output Labels: 1 label
Model Sources
- Documentation: Sentence Transformers Documentation
- Documentation: Cross Encoder Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Cross Encoders on Hugging Face
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import CrossEncoder
# Download from the 🤗 Hub
model = CrossEncoder("tomaarsen/reranker-MiniLM-L12-msmarco-scratch-pos_weight-4")
# Get scores for pairs of texts
pairs = [
    ['enrollment statistics at southern arkansas university', 'The University of Southern Malawi also known as the Malawi University of Science and Technology(MUST) [edit]. The Malawi University of Science and Technology was established on 17th December 2012 by the Malawi University of Science and Technology Act No. 31 of 2012 as the fourth Public University in Malawi.'],
    ['burgos is in what province spain', 'The province of Burgos is a province of northern Spain, in the northeastern part of the autonomous community of Castile and Leon. León it is bordered by the provinces Of, Palencia, Cantabria, Álava, Alava álava, La, Rioja, soria Segovia. And valladolid its capital is the City. of burgoshe province of Burgos is divided into 371 municipalities, being the Spanish province with the highest number, although many of them have fewer than 100 inhabitants.'],
    ['most important customer service skills', 'Customer Service Skill #1: Empathy. Empathy gets thrown around a lot in support training, and for good reason: it might be the single most important customer service skill to develop. To help your customers be happy and successful, it’s important to understand what happiness and success mean to them.'],
    ['what happens if we eat too many carbohydrates', 'What Happens If You Eat Too Many Carbs? We all know the feeling you get after eating a large bowl of pasta. Your stomach swells up and you feel like you just gained 10 pounds. Surprisingly carbohydrates are a very important fuel source for your body. Without them it would be hard to have any energy throughout the day. Even though there are risks to consuming no carbs at all, there are also risks to consuming too much! See the article below where we talk about what could happen if you eat too many carbs. You Will Gain Body Fat Sorry to say this but “yes” if you consume too many carbs than you will gain body fat. This isn’t all that bad though when it comes to building muscle that is. You need to be eating lots of calories throughout the day in order to spark muscle growth. Carbohydrates just happen to have a lot of calories in them.'],
    ['what county is wharton nj in', 'Sponsored Topics. Wharton is a Borough in Morris County, New Jersey, United States. As of the 2000 United States Census, the borough population was 6,298.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)
# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'enrollment statistics at southern arkansas university',
    [
        'The University of Southern Malawi also known as the Malawi University of Science and Technology(MUST) [edit]. The Malawi University of Science and Technology was established on 17th December 2012 by the Malawi University of Science and Technology Act No. 31 of 2012 as the fourth Public University in Malawi.',
        'The province of Burgos is a province of northern Spain, in the northeastern part of the autonomous community of Castile and Leon. León it is bordered by the provinces Of, Palencia, Cantabria, Álava, Alava álava, La, Rioja, soria Segovia. And valladolid its capital is the City. of burgoshe province of Burgos is divided into 371 municipalities, being the Spanish province with the highest number, although many of them have fewer than 100 inhabitants.',
        'Customer Service Skill #1: Empathy. Empathy gets thrown around a lot in support training, and for good reason: it might be the single most important customer service skill to develop. To help your customers be happy and successful, it’s important to understand what happiness and success mean to them.',
        'What Happens If You Eat Too Many Carbs? We all know the feeling you get after eating a large bowl of pasta. Your stomach swells up and you feel like you just gained 10 pounds. Surprisingly carbohydrates are a very important fuel source for your body. Without them it would be hard to have any energy throughout the day. Even though there are risks to consuming no carbs at all, there are also risks to consuming too much! See the article below where we talk about what could happen if you eat too many carbs. You Will Gain Body Fat Sorry to say this but “yes” if you consume too many carbs than you will gain body fat. This isn’t all that bad though when it comes to building muscle that is. You need to be eating lots of calories throughout the day in order to spark muscle growth. Carbohydrates just happen to have a lot of calories in them.',
        'Sponsored Topics. Wharton is a Borough in Morris County, New Jersey, United States. As of the 2000 United States Census, the borough population was 6,298.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
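The shape of the `rank` output shown in the comment above can be illustrated without downloading the model. The following is a minimal sketch, assuming ranking simply means sorting the per-document scores in descending order; `rank_by_score` is a hypothetical helper written for this illustration, not a function of the sentence-transformers library, and the scores are made up.

```python
# Illustrative sketch only: `rank_by_score` is a hypothetical helper,
# not part of sentence-transformers.
def rank_by_score(scores):
    """Return [{'corpus_id': i, 'score': s}, ...] sorted by descending score."""
    ranked = sorted(enumerate(scores), key=lambda item: item[1], reverse=True)
    return [{"corpus_id": idx, "score": float(score)} for idx, score in ranked]

# Made-up scores for five documents; real values would come from model.predict(pairs).
example_scores = [0.02, 0.10, 0.05, 0.91, 0.30]
ranking = rank_by_score(example_scores)
print([entry["corpus_id"] for entry in ranking])
# [3, 4, 1, 2, 0]
```

Each `corpus_id` is the position of the document in the list that was passed in, so the ranked entries can be mapped back to the original texts.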
Evaluation
Metrics
Cross Encoder Reranking
- Datasets: train-eval, NanoMSMARCO, NanoNFCorpus and NanoNQ
- Evaluated with CERerankingEvaluator
Metric | train-eval | NanoMSMARCO | NanoNFCorpus | NanoNQ |
---|---|---|---|---|
map | 0.6582 | 0.6058 (+0.1162) | 0.3384 (+0.0680) | 0.6984 (+0.2778) |
mrr@10 | 0.6556 | 0.5982 (+0.1207) | 0.5367 (+0.0368) | 0.7111 (+0.2844) |
ndcg@10 | 0.7121 | 0.6699 (+0.1294) | 0.3760 (+0.0510) | 0.7469 (+0.2462) |
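For readers unfamiliar with the three metrics in the table, the sketch below shows how each one could be computed for a single query from a list of binary relevance labels in ranked order. It is a textbook-style illustration; the function names are hypothetical, and the evaluator used for the table may differ in details such as tie handling or how candidates are selected. The table values would correspond to these per-query numbers averaged over all queries.

```python
import math

def reciprocal_rank_at_k(relevance, k=10):
    """1 / rank of the first relevant item within the top k, else 0."""
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            return 1.0 / rank
    return 0.0

def average_precision(relevance):
    """Mean of precision@rank taken at each relevant item in the ranking."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def ndcg_at_k(relevance, k=10):
    """DCG of the top k divided by the DCG of an ideal ordering."""
    dcg = sum(rel / math.log2(rank + 1)
              for rank, rel in enumerate(relevance[:k], start=1))
    ideal = sorted(relevance, reverse=True)[:k]
    idcg = sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(ideal, start=1))
    return dcg / idcg if idcg else 0.0

# Made-up example: relevant documents end up at ranks 2 and 4.
ranked_labels = [0, 1, 0, 1, 0]
rr = reciprocal_rank_at_k(ranked_labels)
ap = average_precision(ranked_labels)
ndcg = ndcg_at_k(ranked_labels)
```

For this toy ranking the reciprocal rank and the average precision are both 0.5, and nDCG@10 is about 0.651.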
Cross Encoder Nano BEIR
- Dataset: NanoBEIR_mean
- Evaluated with CENanoBEIREvaluator
Metric | Value |
---|---|
map | 0.5476 (+0.1540) |
mrr@10 | 0.6153 (+0.1473) |
ndcg@10 | 0.5976 (+0.1422) |
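The NanoBEIR_mean values appear to be the unweighted mean of the three Nano dataset scores reported in the reranking table. That reading is an assumption rather than something the card states, but a quick check with the rounded table values reproduces the numbers to within rounding error:

```python
# Per-dataset scores copied from the Cross Encoder Reranking table.
per_dataset = {
    "map": {"NanoMSMARCO": 0.6058, "NanoNFCorpus": 0.3384, "NanoNQ": 0.6984},
    "mrr@10": {"NanoMSMARCO": 0.5982, "NanoNFCorpus": 0.5367, "NanoNQ": 0.7111},
    "ndcg@10": {"NanoMSMARCO": 0.6699, "NanoNFCorpus": 0.3760, "NanoNQ": 0.7469},
}

# Assumption: NanoBEIR_mean is a plain arithmetic mean over the datasets.
nanobeir_mean = {
    metric: sum(scores.values()) / len(scores)
    for metric, scores in per_dataset.items()
}

for metric, value in nanobeir_mean.items():
    print(f"{metric}: {value:.4f}")
```

Small differences in the last digit (for example in map) are expected, because the inputs here are already rounded to four decimals.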
Training Details
Training Dataset
Unnamed Dataset
- Size: 2,000,000 training samples
- Columns: sentence_0, sentence_1, and label
- Approximate statistics based on the first 1000 samples:
Property | sentence_0 | sentence_1 | label |
---|---|---|---|
type | string | string | int |
details | min: 10 characters; mean: 33.9 characters; max: 104 characters | min: 64 characters; mean: 343.08 characters; max: 991 characters | 0: ~81.00%; 1: ~19.00% |
- Samples:
sentence_0 | sentence_1 | label |
---|---|---|
enrollment statistics at southern arkansas university | The University of Southern Malawi also known as the Malawi University of Science and Technology(MUST) [edit]. The Malawi University of Science and Technology was established on 17th December 2012 by the Malawi University of Science and Technology Act No. 31 of 2012 as the fourth Public University in Malawi. | 0 |
burgos is in what province spain | The province of Burgos is a province of northern Spain, in the northeastern part of the autonomous community of Castile and Leon. León it is bordered by the provinces Of, Palencia, Cantabria, Álava, Alava álava, La, Rioja, soria Segovia. And valladolid its capital is the City. of burgoshe province of Burgos is divided into 371 municipalities, being the Spanish province with the highest number, although many of them have fewer than 100 inhabitants. | 1 |
most important customer service skills | Customer Service Skill #1: Empathy. Empathy gets thrown around a lot in support training, and for good reason: it might be the single most important customer service skill to develop. To help your customers be happy and successful, it’s important to understand what happiness and success mean to them. | 1 |
- Loss: FitMixinLoss
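The card does not describe the loss beyond the name FitMixinLoss. The model name ends in "pos_weight-4", and the sampled labels are roughly 81% negative to 19% positive, which hints that positive pairs may have been up-weighted by a factor of 4 in a binary cross-entropy objective. That is an inference, not something the card confirms. Under that assumption, a minimal pure-Python sketch of such a loss (mirroring the semantics of the `pos_weight` argument of PyTorch's `BCEWithLogitsLoss`) could look like this; `weighted_bce_with_logits` is a hypothetical name.

```python
import math

def weighted_bce_with_logits(logit, label, pos_weight=1.0):
    """Binary cross-entropy on a raw logit, with positives scaled by pos_weight."""
    # Numerically stable log(sigmoid(logit)).
    if logit >= 0:
        log_sig = -math.log1p(math.exp(-logit))
    else:
        log_sig = logit - math.log1p(math.exp(logit))
    # log(1 - sigmoid(logit)) = log(sigmoid(logit)) - logit
    log_one_minus_sig = log_sig - logit
    return -(pos_weight * label * log_sig + (1 - label) * log_one_minus_sig)

# Assumed weight, taken from the model name; not confirmed by the card.
ASSUMED_POS_WEIGHT = 4.0

# Ratio of negatives to positives in the first 1000 samples (81% vs 19%).
neg_to_pos_ratio = 81.0 / 19.0

loss_pos = weighted_bce_with_logits(0.0, 1, ASSUMED_POS_WEIGHT)
loss_neg = weighted_bce_with_logits(0.0, 0, ASSUMED_POS_WEIGHT)
```

With a weight near the negative-to-positive ratio (about 4.26 here), the positive and negative classes contribute roughly equally to the total loss despite the imbalance.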
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- num_train_epochs: 1
- fp16: True
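These settings are consistent with the step counts in the training logs. With 2,000,000 training samples, a per-device batch size of 64, a single GPU, and no gradient accumulation, one epoch works out to 31,250 optimizer steps, which is the final step in the log. A short arithmetic sketch (variable names are illustrative):

```python
import math

num_samples = 2_000_000        # training set size from the card
batch_size = 64                # per_device_train_batch_size
num_devices = 1                # 1 x RTX 3090 per the hardware section
grad_accumulation = 1          # gradient_accumulation_steps
num_epochs = 1                 # num_train_epochs

effective_batch = batch_size * num_devices * grad_accumulation
steps_per_epoch = math.ceil(num_samples / effective_batch)
total_steps = steps_per_epoch * num_epochs

def epoch_at(step):
    """Fraction of an epoch completed after `step` optimizer steps."""
    return step / steps_per_epoch

print(total_steps)      # 31250
print(epoch_at(500))    # 0.016
```

The same function explains the Epoch column of the log: the training loss is logged every 500 steps, that is, every 0.016 epochs.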
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | train-eval_ndcg@10 | NanoMSMARCO_ndcg@10 | NanoNFCorpus_ndcg@10 | NanoNQ_ndcg@10 | NanoBEIR_mean_ndcg@10 |
---|---|---|---|---|---|---|---|
-1 | -1 | - | 0.0488 | 0.0971 (-0.4433) | 0.2449 (-0.0802) | 0.0508 (-0.4498) | 0.1310 (-0.3244) |
0.016 | 500 | 1.1004 | - | - | - | - | - |
0.032 | 1000 | 0.7746 | - | - | - | - | - |
0.048 | 1500 | 0.543 | - | - | - | - | - |
0.064 | 2000 | 0.4508 | - | - | - | - | - |
0.08 | 2500 | 0.4112 | - | - | - | - | - |
0.096 | 3000 | 0.3949 | - | - | - | - | - |
0.112 | 3500 | 0.3793 | - | - | - | - | - |
0.128 | 4000 | 0.3584 | - | - | - | - | - |
0.144 | 4500 | 0.3725 | - | - | - | - | - |
0.16 | 5000 | 0.358 | 0.6634 | 0.6343 (+0.0939) | 0.3986 (+0.0735) | 0.7085 (+0.2078) | 0.5805 (+0.1251) |
0.176 | 5500 | 0.3442 | - | - | - | - | - |
0.192 | 6000 | 0.3355 | - | - | - | - | - |
0.208 | 6500 | 0.3423 | - | - | - | - | - |
0.224 | 7000 | 0.3253 | - | - | - | - | - |
0.24 | 7500 | 0.3256 | - | - | - | - | - |
0.256 | 8000 | 0.3231 | - | - | - | - | - |
0.272 | 8500 | 0.3218 | - | - | - | - | - |
0.288 | 9000 | 0.3119 | - | - | - | - | - |
0.304 | 9500 | 0.3056 | - | - | - | - | - |
0.32 | 10000 | 0.3125 | 0.6861 | 0.6423 (+0.1019) | 0.4197 (+0.0947) | 0.7333 (+0.2327) | 0.5985 (+0.1431) |
0.336 | 10500 | 0.3 | - | - | - | - | - |
0.352 | 11000 | 0.305 | - | - | - | - | - |
0.368 | 11500 | 0.3088 | - | - | - | - | - |
0.384 | 12000 | 0.2963 | - | - | - | - | - |
0.4 | 12500 | 0.3068 | - | - | - | - | - |
0.416 | 13000 | 0.299 | - | - | - | - | - |
0.432 | 13500 | 0.2962 | - | - | - | - | - |
0.448 | 14000 | 0.2942 | - | - | - | - | - |
0.464 | 14500 | 0.2969 | - | - | - | - | - |
0.48 | 15000 | 0.2956 | 0.6964 | 0.6397 (+0.0993) | 0.3773 (+0.0523) | 0.7140 (+0.2134) | 0.5770 (+0.1216) |
0.496 | 15500 | 0.2928 | - | - | - | - | - |
0.512 | 16000 | 0.2829 | - | - | - | - | - |
0.528 | 16500 | 0.2794 | - | - | - | - | - |
0.544 | 17000 | 0.2818 | - | - | - | - | - |
0.56 | 17500 | 0.2843 | - | - | - | - | - |
0.576 | 18000 | 0.2858 | - | - | - | - | - |
0.592 | 18500 | 0.2801 | - | - | - | - | - |
0.608 | 19000 | 0.2902 | - | - | - | - | - |
0.624 | 19500 | 0.2768 | - | - | - | - | - |
0.64 | 20000 | 0.2768 | 0.6963 | 0.6456 (+0.1052) | 0.3820 (+0.0570) | 0.7230 (+0.2224) | 0.5835 (+0.1282) |
0.656 | 20500 | 0.2744 | - | - | - | - | - |
0.672 | 21000 | 0.2753 | - | - | - | - | - |
0.688 | 21500 | 0.2632 | - | - | - | - | - |
0.704 | 22000 | 0.2818 | - | - | - | - | - |
0.72 | 22500 | 0.2668 | - | - | - | - | - |
0.736 | 23000 | 0.2673 | - | - | - | - | - |
0.752 | 23500 | 0.2663 | - | - | - | - | - |
0.768 | 24000 | 0.2612 | - | - | - | - | - |
0.784 | 24500 | 0.2655 | - | - | - | - | - |
0.8 | 25000 | 0.2592 | 0.7070 | 0.6614 (+0.1210) | 0.3803 (+0.0552) | 0.7482 (+0.2476) | 0.5966 (+0.1412) |
0.816 | 25500 | 0.2661 | - | - | - | - | - |
0.832 | 26000 | 0.2568 | - | - | - | - | - |
0.848 | 26500 | 0.2651 | - | - | - | - | - |
0.864 | 27000 | 0.2577 | - | - | - | - | - |
0.88 | 27500 | 0.2579 | - | - | - | - | - |
0.896 | 28000 | 0.2552 | - | - | - | - | - |
0.912 | 28500 | 0.2531 | - | - | - | - | - |
0.928 | 29000 | 0.255 | - | - | - | - | - |
0.944 | 29500 | 0.2565 | - | - | - | - | - |
0.96 | 30000 | 0.2534 | 0.7150 | 0.6647 (+0.1243) | 0.3745 (+0.0495) | 0.7479 (+0.2472) | 0.5957 (+0.1403) |
0.976 | 30500 | 0.2508 | - | - | - | - | - |
0.992 | 31000 | 0.2459 | - | - | - | - | - |
1.0 | 31250 | - | 0.7121 | 0.6699 (+0.1294) | 0.3760 (+0.0510) | 0.7469 (+0.2462) | 0.5976 (+0.1422) |
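Evaluation ran every 5,000 steps and once more at the end of training. Because load_best_model_at_end is False, the reported metrics come from the final step rather than from the best intermediate checkpoint. The sketch below picks out the best evaluation step per metric from the rows above; the tuple layout is just an illustrative choice.

```python
# (step, train-eval ndcg@10, NanoBEIR_mean ndcg@10), copied from the log rows
# that contain evaluation results (the untrained baseline row is omitted).
eval_rows = [
    (5000, 0.6634, 0.5805),
    (10000, 0.6861, 0.5985),
    (15000, 0.6964, 0.5770),
    (20000, 0.6963, 0.5835),
    (25000, 0.7070, 0.5966),
    (30000, 0.7150, 0.5957),
    (31250, 0.7121, 0.5976),
]

best_train_eval = max(eval_rows, key=lambda row: row[1])
best_nanobeir = max(eval_rows, key=lambda row: row[2])

print("best train-eval ndcg@10:", best_train_eval[0], best_train_eval[1])
print("best NanoBEIR mean ndcg@10:", best_nanobeir[0], best_nanobeir[2])
```

On these numbers the in-domain train-eval score peaks at step 30,000, while the NanoBEIR mean peaks at step 10,000 and is only marginally lower (0.5976 vs 0.5985) at the end of training.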
Environmental Impact
Carbon emissions were measured using CodeCarbon.
- Energy Consumed: 0.457 kWh
- Carbon Emitted: 0.178 kg of CO2
- Hours Used: 1.209 hours
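Two derived quantities follow directly from these three figures: the average power draw during the run and the implied carbon intensity of the electricity. The sketch below only does that arithmetic on the reported values; it does not use or reproduce any CodeCarbon API.

```python
energy_kwh = 0.457     # Energy Consumed
carbon_kg = 0.178      # Carbon Emitted (kg of CO2)
hours = 1.209          # Hours Used

# Average power over the whole run, in watts.
avg_power_watts = energy_kwh / hours * 1000

# Implied carbon intensity of the electricity, in kg of CO2 per kWh.
carbon_intensity = carbon_kg / energy_kwh

print(f"average power: {avg_power_watts:.0f} W")
print(f"carbon intensity: {carbon_intensity:.3f} kg CO2/kWh")
```

This works out to roughly 378 W on average and about 0.39 kg of CO2 per kWh.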
Training Hardware
- On Cloud: No
- GPU Model: 1 x NVIDIA GeForce RTX 3090
- CPU Model: 13th Gen Intel(R) Core(TM) i7-13700K
- RAM Size: 31.78 GB
Framework Versions
- Python: 3.11.6
- Sentence Transformers: 3.5.0.dev0
- Transformers: 4.48.3
- PyTorch: 2.5.0+cu121
- Accelerate: 1.3.0
- Datasets: 2.20.0
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
Model tree for tomaarsen/reranker-MiniLM-L12-msmarco-scratch-pos_weight-4
- Base model: microsoft/MiniLM-L12-H384-uncased
Evaluation results
- Map on train-eval (self-reported): 0.658
- Mrr@10 on train-eval (self-reported): 0.656
- Ndcg@10 on train-eval (self-reported): 0.712