Asymmetric Inference-free SPLADE Sparse Encoder

This is an Asymmetric Inference-free SPLADE Sparse Encoder model finetuned from opensearch-project/opensearch-neural-sparse-encoding-doc-v3-gte using the sentence-transformers library. It maps sentences and paragraphs to a 30522-dimensional sparse vector space and can be used for semantic search and sparse retrieval.

Model Details

Model Description

  • Model Type: Asymmetric Inference-free SPLADE Sparse Encoder
  • Base model: opensearch-project/opensearch-neural-sparse-encoding-doc-v3-gte
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 30522 dimensions
  • Similarity Function: Dot Product

Full Model Architecture

SparseEncoder(
  (0): Router(
    (sub_modules): ModuleDict(
      (query): Sequential(
        (0): SparseStaticEmbedding({'frozen': False}, dim=30522, tokenizer=DistilBertTokenizerFast)
      )
      (document): Sequential(
        (0): MLMTransformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'NewForMaskedLM'})
        (1): SpladePooling({'pooling_strategy': 'max', 'activation_function': 'log1p_relu', 'word_embedding_dimension': 30522})
      )
    )
  )
)
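
The Router above is what makes the model asymmetric and inference-free on the query side: queries are looked up in a SparseStaticEmbedding table of per-token weights (no transformer forward pass), while documents run through the MLM transformer followed by SPLADE max pooling over the vocabulary logits. The sketch below shows how such a Router-based model can be assembled from its components; the module paths and the Router.for_query_document helper are assumptions based on the Sentence Transformers 5.x API, not the exact construction script used for this checkpoint.

from sentence_transformers import SparseEncoder
from sentence_transformers.models import Router
from sentence_transformers.sparse_encoder.models import MLMTransformer, SparseStaticEmbedding, SpladePooling

# Document route: full MLM transformer followed by SPLADE max pooling with log1p_relu activation.
doc_encoder = MLMTransformer("opensearch-project/opensearch-neural-sparse-encoding-doc-v3-gte")
doc_pooling = SpladePooling(pooling_strategy="max", activation_function="log1p_relu")

# Query route: inference-free static token weights indexed by the same tokenizer.
query_embedding = SparseStaticEmbedding(tokenizer=doc_encoder.tokenizer, frozen=False)

# Send "query" inputs through the static embedding and "document" inputs through the transformer.
router = Router.for_query_document(
    query_modules=[query_embedding],
    document_modules=[doc_encoder, doc_pooling],
)
model = SparseEncoder(modules=[router], similarity_fn_name="dot")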

Metrics

{
    "NDCG": {
        "NDCG@2": 0.90484,
        "NDCG@10": 0.91822,
        "NDCG@20": 0.9204,
        "NDCG@100": 0.92605
    },
    "MAP": {
        "MAP@2": 0.90125,
        "MAP@10": 0.91146,
        "MAP@20": 0.91216,
        "MAP@100": 0.91316
    },
    "Recall": {
        "Recall@2": 0.9045,
        "Recall@10": 0.931,
        "Recall@20": 0.938,
        "Recall@100": 0.963
    },
    "Precision": {
        "P@2": 0.9045,
        "P@10": 0.1862,
        "P@20": 0.0938,
        "P@100": 0.01926
    }
}
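
The figures above are standard ranking metrics (NDCG, MAP, Recall, Precision) at cut-offs 2, 10, 20, and 100. As a rough guide, comparable numbers can be produced with the sparse information-retrieval evaluator shipped with Sentence Transformers; the class name, argument names, and the toy data below are assumptions for illustration only, not the evaluation setup used to obtain these results.

from sentence_transformers import SparseEncoder
from sentence_transformers.sparse_encoder.evaluation import SparseInformationRetrievalEvaluator

model = SparseEncoder("Frinkleko/opensearch-project_opensearch-neural-sparse-encoding-doc-v3-gte-limit-samples-1700")

# Toy evaluation data: id -> text for queries and corpus, plus relevance judgments.
queries = {"q1": "Who likes Sonatas?"}
corpus = {
    "d1": "Kamron Rose likes Landscaping, Spinning, Rugs, Model Building, Figure Skating, Extension Cords, Doctors, Sonatas, Owls.",
    "d2": "Ege Kennedy likes Excitement, Tap Dancing, the Houston Astros, Agave Nectar, Cobras.",
}
relevant_docs = {"q1": {"d1"}}

evaluator = SparseInformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    ndcg_at_k=[2, 10, 20, 100],
    map_at_k=[2, 10, 20, 100],
    precision_recall_at_k=[2, 10, 20, 100],
    name="toy-eval",
)
print(evaluator(model))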

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SparseEncoder

# Download from the 🤗 Hub
model = SparseEncoder("Frinkleko/opensearch-project_opensearch-neural-sparse-encoding-doc-v3-gte-limit-samples-1700")
# Run inference
queries = [
    "Who likes Sonatas?",
]
documents = [
    ' Kamron Rose likes Landscaping, Spinning, Rugs, Model Building, Figure Skating, Extension Cords, Doctors, Sonatas, Owls.',
    ' Ege Kennedy likes Excitement, Tap Dancing, the Houston Astros, Agave Nectar, Cobras.',
    ' Codey Luna likes Bodyboarding, Keys, Tarantulas, Bonsai Trees, Balsamic Vinegar, Kale, Ticket to Ride, Ricotta, Tuning Forks, Silver, Sperm Whales, The Roaring Twenties, Manchego.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 30522] [3, 30522]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[16.2629,  6.6099,  5.9649]])
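
Because both embeddings live in the tokenizer's 30522-token vocabulary space, they can be inspected directly. The follow-up below uses the decode helper to list the highest-weighted tokens per vector; treat it as a sketch assuming the SparseEncoder.decode behavior of Sentence Transformers 5.x.

# Inspect which vocabulary tokens carry weight in each sparse vector.
decoded_queries = model.decode(query_embeddings, top_k=10)
decoded_documents = model.decode(document_embeddings, top_k=10)
print(decoded_queries[0])    # e.g. [("sonatas", ...), ("likes", ...), ...]
print(decoded_documents[0])  # top-weighted tokens of the first document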

Training Details

Training Dataset

Unnamed Dataset

  • Size: 1,700 training samples
  • Columns: query and document
  • Approximate statistics based on the first 1000 samples:
    • query: string, min 6 tokens, mean 7.37 tokens, max 12 tokens
    • document: string, min 19 tokens, mean 44.42 tokens, max 67 tokens
  • Samples:
    • query: Who likes Sunflowers?
      document: Rasmus Logan likes Dark Chocolate, Documentary Series, Washing Machines, Softball, Sunflowers, Gregorian chants, Za'atar, Abacuses, Dolphins, Root Beer Floats, Cumin, Coconut Flour.
    • query: Who likes Shaved Ice?
      document: Abdulkarem Boyer likes Stag Beetles, Acacia Trees, Olives, Landscape Photography, Neoclassicism, Guinea Pigs, Mentoring, Parsley, Chemistry, Vases, Shaved Ice.
    • query: Who likes Rock Balancing?
      document: Tanay Melton likes Poetry Slams, Sperm Whales, Tonic Water, Bat Flowers, Rock Balancing.
  • Loss: SpladeLoss with these parameters:
    {
        "loss": "SparseMultipleNegativesRankingLoss(scale=1.0, similarity_fct='dot_score', gather_across_devices=False)",
        "document_regularizer_weight": 3e-05,
        "query_regularizer_weight": 5e-05
    }
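
The loss above wraps a dot-product, in-batch-negatives ranking loss with separate FLOPS-style sparsity regularization weights for the document and query routes. A minimal construction sketch, assuming the sparse-encoder loss classes of the Sentence Transformers 5.x API:

from sentence_transformers.sparse_encoder.losses import SpladeLoss, SparseMultipleNegativesRankingLoss

# Ranking loss over in-batch negatives with dot-product similarity, plus
# per-route sparsity regularization matching the weights listed above.
loss = SpladeLoss(
    model=model,
    loss=SparseMultipleNegativesRankingLoss(model=model, scale=1.0),
    document_regularizer_weight=3e-05,
    query_regularizer_weight=5e-05,
)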
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 85
  • num_train_epochs: 4
  • warmup_ratio: 0.1
  • batch_sampler: no_duplicates
  • router_mapping: {'query': 'query', 'document': 'document'}
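
The non-default values above map directly onto the trainer configuration. The sketch below shows the corresponding setup, reusing the model and loss objects from the earlier sketches; the SparseEncoderTrainer / SparseEncoderTrainingArguments names and the router_mapping and batch_sampler arguments are assumptions based on the Sentence Transformers 5.x training API, and the two-row dataset is only a placeholder.

from datasets import Dataset
from sentence_transformers import SparseEncoderTrainer, SparseEncoderTrainingArguments
from sentence_transformers.training_args import BatchSamplers

# Placeholder dataset using the same column names as the training data described above.
train_dataset = Dataset.from_dict({
    "query": ["Who likes Sunflowers?", "Who likes Shaved Ice?"],
    "document": [
        "Rasmus Logan likes Dark Chocolate, Documentary Series, Washing Machines, Softball, Sunflowers.",
        "Abdulkarem Boyer likes Stag Beetles, Acacia Trees, Olives, Landscape Photography, Shaved Ice.",
    ],
})

args = SparseEncoderTrainingArguments(
    output_dir="output",
    per_device_train_batch_size=85,
    num_train_epochs=4,
    warmup_ratio=0.1,
    batch_sampler=BatchSamplers.NO_DUPLICATES,
    # Route the "query" column to the query module and the "document" column to the document module.
    router_mapping={"query": "query", "document": "document"},
)

trainer = SparseEncoderTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()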

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 85
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {'query': 'query', 'document': 'document'}
  • learning_rate_mapping: {}

Framework Versions

  • Python: 3.12.9
  • Sentence Transformers: 5.1.0
  • Transformers: 4.56.0
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.10.1
  • Datasets: 4.0.0
  • Tokenizers: 0.22.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

LIMIT

@misc{weller2025theoreticallimit,
      title={On the Theoretical Limitations of Embedding-Based Retrieval}, 
      author={Orion Weller and Michael Boratko and Iftekhar Naim and Jinhyuk Lee},
      year={2025},
      eprint={2508.21038},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2508.21038}, 
}

OpenSearch Models

@inproceedings{Shen_2025, series={SIGIR ’25},
   title={Exploring $\ell_0$ Sparsification for Inference-free Sparse Retrievers},
   url={http://dx.doi.org/10.1145/3726302.3730192},
   DOI={10.1145/3726302.3730192},
   booktitle={Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval},
   publisher={ACM},
   author={Shen, Xinjie and Geng, Zhichao and Yang, Yang},
   year={2025},
   month=jul, pages={2572–2576},
   collection={SIGIR ’25} 
}
@misc{geng2025competitivesearchrelevanceinferencefree,
      title={Towards Competitive Search Relevance For Inference-Free Learned Sparse Retrievers}, 
      author={Zhichao Geng and Yiwen Wang and Dongyu Ru and Yang Yang},
      year={2025},
      eprint={2411.04403},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2411.04403}, 
}

SpladeLoss

@misc{formal2022distillationhardnegativesampling,
      title={From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective},
      author={Thibault Formal and Carlos Lassance and Benjamin Piwowarski and Stéphane Clinchant},
      year={2022},
      eprint={2205.04733},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2205.04733},
}

SparseMultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

FlopsLoss

@article{paria2020minimizing,
    title={Minimizing flops to learn efficient sparse representations},
    author={Paria, Biswajit and Yeh, Chih-Kuan and Yen, Ian EH and Xu, Ning and Ravikumar, Pradeep and P{\'o}czos, Barnab{\'a}s},
    journal={arXiv preprint arXiv:2004.05665},
    year={2020}
}