BGE base Financial Matryoshka
This is a sentence-transformers model finetuned from Snowflake/snowflake-arctic-embed-m-v1.5 on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Snowflake/snowflake-arctic-embed-m-v1.5
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: json
- Language: en
- License: apache-2.0
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
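The three modules listed above amount to: run the transformer, keep the hidden state of the first ([CLS]) token, and scale it to unit length. A minimal pure-Python sketch of the last two stages follows. It assumes `token_embeddings` is a made-up stand-in for the transformer output (real vectors have 768 entries), and `cls_pool` and `l2_normalize` are hypothetical helper names, not library functions.

```python
import math

def cls_pool(token_embeddings):
    """Return the first token's vector (pooling_mode_cls_token=True)."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length, as a Normalize() stage would."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)  # leave an all-zero vector unchanged
    return [x / norm for x in vector]

# Made-up 3-token, 4-dimensional transformer output.
token_embeddings = [
    [3.0, 4.0, 0.0, 0.0],  # [CLS]
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.0, 2.0, 1.0],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)  # [0.6, 0.8, 0.0, 0.0]
```

Because the output has unit length, the cosine similarity of two such embeddings equals their dot product.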
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Abinaya/snowflake-arctic-embed-financial-matryoshka")
# Run inference
sentences = [
    'As of December 31, 2023, Bank of America reported gross derivative assets and liabilities totaling $290.3 billion and $301.2 billion, respectively. After accounting for legally enforceable master netting agreements and cash collateral, these figures were adjusted to $39.3 billion in assets and $43.4 billion in liabilities.',
    'What were the total derivative assets and liabilities at Bank of America as of December 31, 2023, after adjusting for master netting agreements and cash collateral?',
    'By what percentage did HIV product sales increase in 2023 compared to the previous year?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
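Because the model was trained with MatryoshkaLoss at 768, 512, 256, 128 and 64 dimensions, a shorter embedding can be obtained by keeping only the leading dimensions and re-normalizing. The sketch below illustrates the idea on plain Python lists. The vectors are made-up stand-ins for `model.encode(...)` output, and `truncate_embedding` is a hypothetical helper, not part of the library.

```python
import math

def truncate_embedding(embedding, dim):
    """Keep the first `dim` values and rescale them to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else list(head)

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 8-dimensional vectors standing in for 768-dimensional embeddings.
passage = [0.5, 0.5, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0]
query = [0.5, 0.5, 0.5, 0.0, 0.5, 0.0, 0.0, 0.0]

full_score = cosine(passage, query)
short_score = cosine(truncate_embedding(passage, 4), truncate_embedding(query, 4))
print(round(full_score, 4), round(short_score, 4))
```

Sentence Transformers also exposes a `truncate_dim` argument on the `SentenceTransformer` constructor for the same purpose. The evaluation table in this card gives an idea of how much retrieval quality is traded away at each size.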
Evaluation
Metrics
Information Retrieval
- Datasets: dim_768, dim_512, dim_256, dim_128 and dim_64
- Evaluated with InformationRetrievalEvaluator
Metric | dim_768 | dim_512 | dim_256 | dim_128 | dim_64 |
---|---|---|---|---|---|
cosine_accuracy@1 | 0.7543 | 0.7543 | 0.7557 | 0.7443 | 0.7 |
cosine_accuracy@3 | 0.8614 | 0.8614 | 0.86 | 0.85 | 0.8157 |
cosine_accuracy@5 | 0.8914 | 0.8914 | 0.8943 | 0.8871 | 0.8571 |
cosine_accuracy@10 | 0.9329 | 0.93 | 0.9286 | 0.9143 | 0.9071 |
cosine_precision@1 | 0.7543 | 0.7543 | 0.7557 | 0.7443 | 0.7 |
cosine_precision@3 | 0.2871 | 0.2871 | 0.2867 | 0.2833 | 0.2719 |
cosine_precision@5 | 0.1783 | 0.1783 | 0.1789 | 0.1774 | 0.1714 |
cosine_precision@10 | 0.0933 | 0.093 | 0.0929 | 0.0914 | 0.0907 |
cosine_recall@1 | 0.7543 | 0.7543 | 0.7557 | 0.7443 | 0.7 |
cosine_recall@3 | 0.8614 | 0.8614 | 0.86 | 0.85 | 0.8157 |
cosine_recall@5 | 0.8914 | 0.8914 | 0.8943 | 0.8871 | 0.8571 |
cosine_recall@10 | 0.9329 | 0.93 | 0.9286 | 0.9143 | 0.9071 |
cosine_ndcg@10 | 0.8431 | 0.8409 | 0.8409 | 0.8298 | 0.8023 |
cosine_mrr@10 | 0.8144 | 0.8124 | 0.8129 | 0.8026 | 0.7689 |
cosine_map@100 | 0.8171 | 0.8153 | 0.8158 | 0.8061 | 0.7722 |
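In the table above, accuracy@k and recall@k are identical in every column, and precision@k equals accuracy@k divided by k (for example 0.8614 / 3 ≈ 0.2871). That pattern is what one would expect if each query has exactly one relevant passage; this is an inference from the numbers, not something the card states. Under that assumption, every metric reduces to a function of the rank of the single relevant passage, as this sketch with made-up ranks shows. `single_positive_metrics` is a hypothetical helper, not the evaluator's implementation.

```python
import math

def single_positive_metrics(ranks, k):
    """Compute @k metrics when each query has exactly one relevant passage.

    `ranks` holds the 1-based rank of that passage for each query,
    or None when it was not retrieved at all.
    """
    n = len(ranks)
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    return {
        "accuracy": accuracy,
        "recall": accuracy,         # one relevant item: either found or not
        "precision": accuracy / k,  # at most one of the k results is relevant
        "mrr": sum(1.0 / r for r, h in zip(ranks, hits) if h) / n,
        # Ideal DCG is 1 (relevant item at rank 1), so NDCG = 1 / log2(rank + 1).
        "ndcg": sum(1.0 / math.log2(r + 1) for r, h in zip(ranks, hits) if h) / n,
    }

# Made-up ranks for four queries; the last query misses entirely.
example_ranks = [1, 3, 1, None]
metrics_at_3 = single_positive_metrics(example_ranks, k=3)
print(metrics_at_3)
```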
Training Details
Training Dataset
json
- Dataset: json
- Size: 6,300 training samples
- Columns: positive and anchor
- Approximate statistics based on the first 1000 samples:

 | positive | anchor |
---|---|---|
type | string | string |
details | min: 6 tokens, mean: 46.81 tokens, max: 326 tokens | min: 9 tokens, mean: 20.23 tokens, max: 45 tokens |
- Samples:

positive | anchor |
---|---|
Opioids Related Securities Class Actions and Derivative Litigation: Three derivative complaints and two securities class actions drawing heavily on the allegations of the DOJ complaint have been filed in Delaware naming the Company and various current and former directors and certain current and former officers as defendants. The plaintiffs in the derivative suits (in which the Company is a nominal defendant) allege, among other things, that the defendants breached their fiduciary duties in connection with oversight of opioids dispensing and distribution and that the defendants violated Section 14(a) of the Securities Exchange Act of 1934, as amended (the 'Exchange Act'), and are liable for contribution under Section 10(b) of the Exchange Act in connection with the Company's disclosures about opioids. | What kind of claims are involved in the securities and derivative litigation against the Company listed in the document? |
Walmart's fintech venture, ONE, provides financial services such as money orders, prepaid access, money transfers, check cashing, bill payment, and certain types of installment lending. | What types of financial services are offered through Walmart's fintech venture, ONE? |
Juice and juice concentrate from various fruits, particularly orange juice and orange juice concentrate, are principal raw materials for juice and juice drink products, and milk is the principal raw material for dairy products managed through fairlife, LLC. | What are the primary raw materials for the company's juice and dairy products? |
- Loss: MatryoshkaLoss with these parameters:
  {
    "loss": "MultipleNegativesRankingLoss",
    "matryoshka_dims": [768, 512, 256, 128, 64],
    "matryoshka_weights": [1, 1, 1, 1, 1],
    "n_dims_per_step": -1
  }
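To make the loss configuration concrete, here is a rough pure-Python sketch of the idea: an in-batch ranking loss (each anchor should score its own positive above every other positive in the batch) is evaluated on embeddings truncated to each Matryoshka dimension, and the results are combined with the given weights. This is an illustration under stated assumptions, not the library code: the similarity scale of 20, the toy vectors, and the function names are all assumptions made for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def in_batch_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]
    and every other positive in the batch acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss over truncated embeddings."""
    return sum(
        w * in_batch_ranking_loss([a[:d] for a in anchors], [p[:d] for p in positives])
        for d, w in zip(dims, weights)
    )

# Toy 4-dimensional batch (assumed values); the card uses dims 768 to 64 with weights of 1.
anchors = [[1.0, 0.0, 0.2, 0.0], [0.0, 1.0, 0.0, 0.3]]
positives = [[0.9, 0.1, 0.1, 0.0], [0.1, 0.8, 0.0, 0.4]]
loss = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
print(f"{loss:.6f}")
```

Pairing each anchor with the wrong positive (for example by reversing `positives`) makes the loss much larger, which is the signal that pulls matching pairs together at every truncation length.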
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: epoch
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 16
- gradient_accumulation_steps: 16
- learning_rate: 2e-05
- num_train_epochs: 4
- lr_scheduler_type: cosine
- warmup_ratio: 0.1
- bf16: True
- tf32: True
- load_best_model_at_end: True
- optim: adamw_torch_fused
- batch_sampler: no_duplicates
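Two consequences of these settings are worth spelling out. With a per-device batch of 32 and 16 accumulation steps, each optimizer update sees 32 × 16 = 512 pairs per device. And a cosine schedule with a warmup ratio of 0.1 ramps the learning rate linearly up to 2e-05 before decaying it along a half cosine. The sketch below assumes a single device and 48 total optimizer steps (the last step in the training logs); `learning_rate_at` is a hypothetical helper that follows the usual linear-warmup, cosine-decay formula rather than calling the library.

```python
import math

PER_DEVICE_BATCH = 32
GRAD_ACCUM_STEPS = 16
BASE_LR = 2e-05
WARMUP_RATIO = 0.1
TOTAL_STEPS = 48  # assumption: last step shown in the training logs

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS  # pairs per update on one device
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)

def learning_rate_at(step):
    """Linear warmup to BASE_LR, then half-cosine decay to zero."""
    if step < warmup_steps:
        return BASE_LR * step / warmup_steps
    progress = (step - warmup_steps) / (TOTAL_STEPS - warmup_steps)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print(effective_batch, warmup_steps)
for step in (0, warmup_steps, 24, TOTAL_STEPS):
    print(step, f"{learning_rate_at(step):.2e}")
```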
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: epoch
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 16
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: cosine
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: True
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
---|---|---|---|---|---|---|---|
0.8122 | 10 | 1.5521 | - | - | - | - | - |
1.0 | 13 | - | 0.8136 | 0.8108 | 0.8143 | 0.7949 | 0.7552 |
1.5685 | 20 | 0.4812 | - | - | - | - | - |
2.0 | 26 | - | 0.8405 | 0.8388 | 0.8384 | 0.8284 | 0.7990 |
2.3249 | 30 | 0.3585 | - | - | - | - | - |
3.0 | 39 | - | 0.8420 | 0.8409 | 0.8408 | 0.8290 | 0.8009 |
3.0812 | 40 | 0.3101 | - | - | - | - | - |
**3.731** | **48** | **-** | **0.8431** | **0.8409** | **0.8409** | **0.8298** | **0.8023** |
- The bold row denotes the saved checkpoint.
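The fractional epochs in the log can be reproduced from the dataset size and batch settings. The sketch below assumes a single device, 6,300 samples split into batches of 32, 16 batches per optimizer step, and one extra optimizer step for the leftover batches at the end of each epoch. These bookkeeping rules are an assumption that happens to match the logged values, not something the card states, and `epoch_at` is a hypothetical helper.

```python
import math

SAMPLES = 6300
BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 16

batches_per_epoch = math.ceil(SAMPLES / BATCH_SIZE)                # 197
steps_per_epoch = math.ceil(batches_per_epoch / GRAD_ACCUM_STEPS)  # 13, the last one partial

def epoch_at(step):
    """Fractional epoch reached after `step` optimizer updates."""
    full_epochs, steps_into_epoch = divmod(step, steps_per_epoch)
    batches_seen = min(steps_into_epoch * GRAD_ACCUM_STEPS, batches_per_epoch)
    return full_epochs + batches_seen / batches_per_epoch

for step in (10, 13, 20, 26, 30, 39, 40, 48):
    print(step, round(epoch_at(step), 4))
```

Training ending at step 48 rather than 4 × 13 = 52 would be consistent with the step budget being derived from the 12 full accumulation steps per epoch (197 // 16) times 4 epochs, though the card does not say so.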
Framework Versions
- Python: 3.10.15
- Sentence Transformers: 3.4.0
- Transformers: 4.47.1
- PyTorch: 2.5.1+cu124
- Accelerate: 1.0.1
- Datasets: 3.0.1
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}