Finance Entity Extractor (FinEE) v1.0

Production-grade Finance NER for Indian Banks
Hybrid Regex + Phi-3 LLM • 94.5% accuracy with LLM • <1ms latency in regex-only mode


🔥 Hybrid Architecture

Runs 100% offline using regex by default. The optional 3.8B-parameter Phi-3 LLM is downloaded once, only when LLM mode is enabled, and is called only for complex edge cases.

| Mode            | Latency | Accuracy | Model Download    |
|-----------------|---------|----------|-------------------|
| Regex (Default) | <1ms    | 87.5%    | ❌ None           |
| Regex + LLM     | ~50ms   | 94.5%    | ✅ 7GB (one-time) |

⚡ Install in 10 Seconds

pip install finee

Then, in Python:

from finee import extract

r = extract("Rs.2500 debited from A/c XX3545 to swiggy@ybl on 28-12-2025")

print(r.amount)    # 2500.0
print(r.merchant)  # "Swiggy"
print(r.category)  # "food"

Try it now: Open In Colab


🧠 Enable LLM Mode (For Edge Cases)

from finee import FinEE
from finee.schema import ExtractionConfig

# Downloads 7GB model once, then runs locally
extractor = FinEE(ExtractionConfig(use_llm=True))
result = extractor.extract("Your complex bank message...")

Supported Backends:

  • Apple Silicon → MLX (fastest)
  • NVIDIA GPU → PyTorch/CUDA
  • CPU → llama.cpp (GGUF)
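
The source does not describe how a backend is chosen, so the following is only a minimal sketch of one way the mapping above could be expressed. `pick_backend` and `detect_backend` are hypothetical names, not part of finee; the only external call is PyTorch's `torch.cuda.is_available()`, which is used if PyTorch happens to be installed.

```python
import platform


def pick_backend(system: str, machine: str, has_cuda: bool) -> str:
    """Map a platform description to one of the three listed backends."""
    if system == "Darwin" and machine == "arm64":
        return "mlx"        # Apple Silicon
    if has_cuda:
        return "pytorch"    # NVIDIA GPU
    return "llama.cpp"      # CPU fallback (GGUF)


def detect_backend() -> str:
    """Inspect the current machine (hypothetical helper, not finee's API)."""
    try:
        import torch  # optional dependency
        has_cuda = torch.cuda.is_available()
    except ImportError:
        has_cuda = False
    return pick_backend(platform.system(), platform.machine(), has_cuda)
```

Keeping the decision in a pure function (`pick_backend`) makes the priority order easy to test without the hardware.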

📋 Output Schema Contract

Every extraction returns this guaranteed structure (shown as JSON; the // comments are annotations, not part of the output):

{
  "amount": 2500.0,           // float - Always numeric
  "currency": "INR",          // string - ISO 4217
  "type": "debit",            // "debit" | "credit"
  "account": "3545",          // string - Last 4 digits
  "date": "28-12-2025",       // string - DD-MM-YYYY
  "reference": "534567891234",// string - UPI/NEFT ref
  "merchant": "Swiggy",       // string - Normalized name
  "category": "food",         // string - food|shopping|transport|...
  "confidence": 0.95          // float - 0.0 to 1.0
}
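
A consumer can check this contract on its own side. The sketch below is a hypothetical validator (`validate_result` is not part of finee) that checks a plain dict against the types and ranges listed above. It assumes every field is always populated, which the source does not state explicitly.

```python
from datetime import datetime

# Field name -> expected Python type, mirroring the schema above.
# Hypothetical helper for illustration; not part of the finee package.
SCHEMA = {
    "amount": float,
    "currency": str,
    "type": str,
    "account": str,
    "date": str,
    "reference": str,
    "merchant": str,
    "category": str,
    "confidence": float,
}


def validate_result(result: dict) -> list:
    """Return a list of contract violations (empty if the dict conforms)."""
    errors = []
    for field, expected in SCHEMA.items():
        if field not in result:
            errors.append(f"missing field: {field}")
        elif not isinstance(result[field], expected):
            errors.append(f"{field}: expected {expected.__name__}")
    if errors:
        return errors  # value checks below need well-typed fields
    if result["type"] not in ("debit", "credit"):
        errors.append("type: must be 'debit' or 'credit'")
    if not 0.0 <= result["confidence"] <= 1.0:
        errors.append("confidence: must be between 0.0 and 1.0")
    try:
        datetime.strptime(result["date"], "%d-%m-%Y")
    except ValueError:
        errors.append("date: must be DD-MM-YYYY")
    return errors
```

An empty list means the dict matches the contract; otherwise each entry names the offending field.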

🔬 Verify Accuracy Yourself

git clone https://github.com/Ranjitbehera0034/Finance-Entity-Extractor.git
cd Finance-Entity-Extractor
pip install finee
python benchmark.py --all

💀 Edge Case Handling

| Input                                     | Result             |
|-------------------------------------------|--------------------|
| Rs.500.00debited from A/c1234 (no spaces) | ✅ amount=500.0    |
| ₹2,500 debited (Unicode)                  | ✅ amount=2500.0   |
| 1.5 Lakh credited (Lakhs)                 | ✅ amount=150000.0 |
| Rs.500 debited. Bal: Rs.15,000 (multiple) | ✅ amount=500.0    |
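
To make these cases concrete, here is a minimal regex sketch that handles the four inputs above. It is an illustration only, not finee's actual pattern set: `AMOUNT` and `parse_amount` are hypothetical names, and treating the first marked amount as the transaction amount is a simplifying assumption.

```python
import re
from typing import Optional

# Optional currency marker, a number with optional thousands separators and
# decimals, then an optional "lakh"/"lac" suffix (1 lakh = 100,000).
AMOUNT = re.compile(
    r"(Rs\.?|INR|₹)?\s*(\d[\d,]*(?:\.\d+)?)\s*(lakhs?\b|lacs?\b)?",
    re.IGNORECASE,
)


def parse_amount(text: str) -> Optional[float]:
    """Return the first number carrying a currency marker or lakh suffix."""
    for match in AMOUNT.finditer(text):
        marker, number, lakh = match.groups()
        if not marker and not lakh:
            continue  # bare number: account, date, or reference digits
        value = float(number.replace(",", ""))
        return value * 100_000 if lakh else value
    return None
```

Because no whitespace is required around the number, `Rs.500.00debited` still parses, and because the first marked amount wins, the balance in `Bal: Rs.15,000` is ignored.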

🏦 Supported Banks

  • HDFC
  • ICICI
  • SBI
  • Axis
  • Kotak

📊 Benchmark

| Metric              | Value            |
|---------------------|------------------|
| Field Accuracy      | 94.5% (with LLM) |
| Regex-only Accuracy | 87.5%            |
| Latency (Regex)     | <1ms             |
| Throughput          | 50,000+ msg/sec  |
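
The source does not define how field accuracy is computed; benchmark.py is the authoritative reference. One common reading, assumed here, is the share of expected fields whose extracted value matches exactly. `field_accuracy` below is a hypothetical helper that sketches that reading.

```python
def field_accuracy(predictions: list, expected: list) -> float:
    """Share of expected fields whose predicted value matches exactly."""
    correct = 0
    total = 0
    for pred, gold in zip(predictions, expected):
        for field, value in gold.items():
            total += 1
            if pred.get(field) == value:
                correct += 1
    return correct / total if total else 0.0
```

For example, two messages with two labeled fields each and one wrong field give 3 correct out of 4, or 0.75.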

🏗️ Architecture

Input Text
    │
    ▼
┌─────────────────────────────────────────────────────────────┐
│ TIER 0: Hash Cache (<1ms if seen before)                    │
└─────────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────────┐
│ TIER 1: Regex Engine (50+ patterns)                        │
└─────────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────────┐
│ TIER 2: Rule-Based Mapping (200+ VPA → merchant)           │
└─────────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────────┐
│ TIER 3: Phi-3 LLM (Optional - downloads 7GB model)         │
│         Only called for edge cases                         │
└─────────────────────────────────────────────────────────────┘
    │
    ▼
ExtractionResult (Guaranteed Schema)
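
The tiers above can be sketched as a single function. This is a simplified illustration under stated assumptions, not finee's implementation: the patterns, the one-entry `VPA_TO_MERCHANT` table, the three-field result, and the rule "call the LLM only when a field is still missing" are all hypothetical stand-ins.

```python
import hashlib
import re
from typing import Callable, Optional

# Illustrative stand-ins; the real package lists 50+ patterns and 200+ VPAs.
AMOUNT_RE = re.compile(r"(?:Rs\.?|INR|₹)\s*(\d[\d,]*(?:\.\d+)?)", re.IGNORECASE)
VPA_RE = re.compile(r"\b([\w.-]+)@\w+\b")
VPA_TO_MERCHANT = {"swiggy": ("Swiggy", "food")}

_cache = {}  # text hash -> previously computed result


def extract_tiered(text: str, llm: Optional[Callable[[str], dict]] = None) -> dict:
    # Tier 0: return the cached result if this exact text was seen before.
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key in _cache:
        return _cache[key]

    result = {"amount": None, "merchant": None, "category": None}

    # Tier 1: regex extraction.
    amount = AMOUNT_RE.search(text)
    if amount:
        result["amount"] = float(amount.group(1).replace(",", ""))

    # Tier 2: map the VPA handle (the part before "@") to a merchant.
    vpa = VPA_RE.search(text)
    if vpa:
        handle = vpa.group(1).lower()
        if handle in VPA_TO_MERCHANT:
            result["merchant"], result["category"] = VPA_TO_MERCHANT[handle]

    # Tier 3: call the optional LLM only if some field is still missing,
    # and let it fill gaps without overwriting earlier tiers.
    if llm is not None and any(v is None for v in result.values()):
        for field, value in llm(text).items():
            if result.get(field) is None:
                result[field] = value

    _cache[key] = result
    return result
```

With this ordering, a message fully handled by the regex and mapping tiers never reaches the LLM, and a repeated message is served from the cache.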

📁 Repository Structure

Finance-Entity-Extractor/
├── src/finee/              # Core package
├── tests/                  # 88 unit tests
├── examples/demo.ipynb     # 👈 Try in Colab!
├── benchmark.py            # Verify accuracy
├── CHANGELOG.md            # Release history
└── CONTRIBUTING.md         # How to contribute

🤝 Contributing

See CONTRIBUTING.md for:

  • Git Flow branching strategy
  • How to run tests
  • Release process

📄 License

MIT License


Made with ❤️ by Ranjit Behera

PyPI • GitHub • Hugging Face
