---
language: en
library_name: sentence-transformers
license: mit
pipeline_tag: sentence-similarity
tags:
- cross-encoder
- regression
- trail-rag
- pathfinder-rag
- msmarco
- passage-ranking
- sentence-transformers
model-index:
- name: trailrag-cross-encoder-msmarco-enhanced
  results:
  - task:
      type: text-ranking
    dataset:
      name: MS MARCO
      type: msmarco
    metrics:
    - type: mse
      value: 0.0423588519082496
    - type: mae
      value: 0.1121706619454281
    - type: rmse
      value: 0.2058126621669562
    - type: r2_score
      value: 0.7490766636371498
    - type: pearson_correlation
      value: 0.9093360796297332
    - type: spearman_correlation
      value: 0.8886928996060736
---
# TrailRAG Cross-Encoder: MS MARCO Enhanced
This is a fine-tuned cross-encoder model specifically optimized for **Passage Ranking** tasks, trained as part of the PathfinderRAG research project.
## Model Details
- **Model Type**: Cross-Encoder for Regression (continuous similarity scores)
- **Base Model**: `cross-encoder/ms-marco-MiniLM-L-6-v2`
- **Training Dataset**: MS MARCO (Large-scale passage ranking dataset from Microsoft)
- **Task**: Passage Ranking
- **Library**: sentence-transformers
- **License**: MIT
## Performance Metrics
### Final Regression Metrics
| Metric | Value | Description |
|--------|-------|-------------|
| **MSE** | **0.042359** | Mean Squared Error (lower is better) |
| **MAE** | **0.112171** | Mean Absolute Error (lower is better) |
| **RMSE** | **0.205813** | Root Mean Squared Error (lower is better) |
| **R² Score** | **0.749077** | Coefficient of determination (higher is better) |
| **Pearson Correlation** | **0.909336** | Linear correlation (higher is better) |
| **Spearman Correlation** | **0.888693** | Rank correlation (higher is better) |
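For reference, the regression metrics above can be recomputed from predicted and gold scores with standard tooling. The sketch below is illustrative only (the arrays are placeholder data, not the actual evaluation set) and assumes `scikit-learn` and `scipy` are available.
```python
import numpy as np
from scipy.stats import pearsonr, spearmanr
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Placeholder arrays: gold relevance labels and model predictions in [0, 1]
y_true = np.array([0.0, 0.2, 0.5, 0.8, 1.0])
y_pred = np.array([0.1, 0.25, 0.45, 0.7, 0.95])

mse = mean_squared_error(y_true, y_pred)
metrics = {
    "mse": mse,
    "mae": mean_absolute_error(y_true, y_pred),
    "rmse": float(np.sqrt(mse)),
    "r2_score": r2_score(y_true, y_pred),
    "pearson_correlation": pearsonr(y_true, y_pred)[0],
    "spearman_correlation": spearmanr(y_true, y_pred)[0],
}
print(metrics)
```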
### Training Details
- **Training Duration**: 21 minutes
- **Epochs**: 6
- **Early Stopping**: No
- **Best Correlation Score**: 0.944649
- **Final MSE**: 0.042359
### Training Configuration
- **Batch Size**: 20
- **Learning Rate**: 3e-05
- **Max Epochs**: 6
- **Weight Decay**: 0.01
- **Warmup Steps**: 100
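As a rough sketch of how the configuration above could map onto the classic `CrossEncoder.fit` API in sentence-transformers (v2/v3), see the example below. The data-loading lines and output path are placeholders, not the actual TrailRAG training script.
```python
from torch.utils.data import DataLoader
from sentence_transformers import CrossEncoder, InputExample
from sentence_transformers.cross_encoder.evaluation import CECorrelationEvaluator

# Placeholder data: (query, passage) pairs with continuous labels in [0, 1]
train_samples = [InputExample(texts=["query text", "relevant passage"], label=0.8)]
dev_samples = [InputExample(texts=["query text", "unrelated passage"], label=0.1)]

# Start from the base model and train a single regression output
model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2", num_labels=1)

train_dataloader = DataLoader(train_samples, shuffle=True, batch_size=20)
evaluator = CECorrelationEvaluator.from_input_examples(dev_samples, name="dev")

model.fit(
    train_dataloader=train_dataloader,
    evaluator=evaluator,
    epochs=6,
    warmup_steps=100,
    optimizer_params={"lr": 3e-5},
    weight_decay=0.01,
    output_path="output/trailrag-cross-encoder-msmarco-enhanced",  # placeholder path
)
```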
## Usage
This model can be used with the sentence-transformers library for computing semantic similarity scores between query-document pairs.
### Installation
```bash
pip install sentence-transformers
```
### Basic Usage
```python
from sentence_transformers import CrossEncoder
# Load the model
model = CrossEncoder('OloriBern/trailrag-cross-encoder-msmarco-enhanced')
# Example usage
pairs = [
['What is artificial intelligence?', 'AI is a field of computer science focused on creating intelligent machines.'],
['What is artificial intelligence?', 'Paris is the capital of France.']
]
# Get similarity scores (continuous values, not binary)
scores = model.predict(pairs)
print(scores) # Higher scores indicate better semantic match
```
### Advanced Usage in PathfinderRAG
```python
from sentence_transformers import CrossEncoder
# Initialize for PathfinderRAG exploration
cross_encoder = CrossEncoder('OloriBern/trailrag-cross-encoder-msmarco-enhanced')
def score_query_document_pair(query: str, document: str) -> float:
    """Score a query-document pair for relevance."""
    score = cross_encoder.predict([[query, document]])[0]
    return float(score)

# Use in document exploration
query = "Your research query"
documents = ["Document 1 text", "Document 2 text"]  # ... add more documents

# Score all pairs and rank documents by relevance
scores = cross_encoder.predict([[query, doc] for doc in documents])
ranked_docs = sorted(zip(documents, scores), key=lambda x: x[1], reverse=True)
```
## Training Process
This model was trained with a **regression objective** (rather than binary classification) to predict continuous similarity scores in the range [0, 1]. The training process focused on:
1. **Data Quality**: Used authentic MS MARCO examples with careful contamination filtering
2. **Regression Approach**: Avoided binary classification, maintaining continuous label distribution
3. **Correlation Optimization**: Maximized Spearman correlation for effective ranking
4. **Scientific Rigor**: All metrics derived from real training runs without simulation
### Why Regression Over Classification?
Cross-encoders for information retrieval should predict **continuous similarity scores**, not binary classifications. This approach:
- Preserves fine-grained similarity distinctions
- Enables better ranking and document selection
- Provides more informative scores for downstream applications
- Aligns with the mathematical foundation of information retrieval
## Dataset
**MS MARCO**: Large-scale passage ranking dataset from Microsoft
- **Task Type**: Passage Ranking
- **Training Examples**: 1,000 high-quality pairs
- **Validation Split**: 20% (200 examples)
- **Quality Threshold**: ≥0.70 (authentic TrailRAG metrics)
- **Contamination**: Zero overlap between splits
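The zero-overlap guarantee amounts to a query-level split check. The sketch below illustrates one simple way such a check could look; the synthetic pairs and the 80/20 split are placeholders, not the actual TrailRAG contamination-filtering pipeline.
```python
import random

# Placeholder: (query, passage, label) triples standing in for the MS MARCO pairs
pairs = [(f"query {i}", f"passage {i}", random.random()) for i in range(1000)]

# Split at the query level so no query text appears in both sets
queries = sorted({q for q, _, _ in pairs})
random.seed(42)
random.shuffle(queries)
cutoff = int(len(queries) * 0.8)
train_queries, val_queries = set(queries[:cutoff]), set(queries[cutoff:])

train_pairs = [p for p in pairs if p[0] in train_queries]
val_pairs = [p for p in pairs if p[0] in val_queries]

# Zero overlap between splits
assert train_queries.isdisjoint(val_queries)
print(len(train_pairs), "train /", len(val_pairs), "validation pairs")
```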
## Limitations
- Optimized specifically for passage ranking tasks
- Performance may vary on out-of-domain data
- Requires sentence-transformers library for inference
- Trained on CPU (GPU-optimized training is planned for future versions)
## Citation
```bibtex
@misc{trailrag-cross-encoder-msmarco,
title = {TrailRAG Cross-Encoder: MS MARCO Enhanced},
author = {PathfinderRAG Team},
year = {2025},
publisher = {Hugging Face},
url = {https://huggingface.co/OloriBern/trailrag-cross-encoder-msmarco-enhanced}
}
```
## Model Card Contact
For questions about this model, please open an issue in the [PathfinderRAG repository](https://github.com/your-org/trail-rag-1) or contact the development team.
---
*This model card was automatically generated using the TrailRAG model card generator with authentic training metrics.*