🧠 English to Spanish Translation AI Model
This repository contains a Transformer-based model fine-tuned for English-to-Spanish text translation. The model was fine-tuned, converted to half precision (FP16), and evaluated with sacrebleu, reaching a BLEU score of roughly 34 on the validation set. It is intended for real-world use cases such as educational tools, real-time communication, and travel assistants.
🚀 Features
- 🔁 Language Pair: English → Spanish
- 🔧 Model: Helsinki-NLP/opus-mt-en-es
- 🧪 Quantized: FP16 for efficient inference
- 🎯 Accuracy: BLEU of roughly 34 on the validation set (sacrebleu)
- ⚡ CUDA Enabled: Fast training and inference
📊 Dataset Used
Hugging Face Dataset: OscarNav/spa-eng
- Source: OscarNav
- Language Pair: en-es
- Dataset Size: ~107K sentence pairs
from datasets import load_dataset
dataset = load_dataset("OscarNav/spa-eng", lang1="en", lang2="es")
🛠️ Model Training & Fine-Tuning
Pretrained Base Model: Helsinki-NLP/opus-mt-en-es
Tokenizer: AutoTokenizer from Hugging Face Transformers
Training Environment: Kaggle Notebook with CUDA GPU
Batch Size: 16
Epochs: 3–5 (based on early stopping)
Optimizer: AdamW
Loss Function: CrossEntropyLoss
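The settings listed above could be wired into the Hugging Face `Seq2SeqTrainer` roughly as follows. This is a minimal sketch, not the training script actually used: the learning rate, the early-stopping patience, the output directory, and the `build_trainer` helper are assumptions introduced for illustration. AdamW and cross-entropy loss do not appear explicitly because they are the defaults for `Trainer` and for this model class.

```python
# Hyperparameters from this README, plus clearly marked assumptions.
HPARAMS = {
    "base_model": "Helsinki-NLP/opus-mt-en-es",
    "batch_size": 16,        # stated in the README
    "max_epochs": 5,         # upper end of the stated 3-5 range
    "learning_rate": 2e-5,   # assumption: not stated in the README
    "patience": 2,           # assumption: early-stopping patience in epochs
    "output_dir": "opus-mt-en-es-finetuned",  # hypothetical path
}


def build_trainer(train_ds, eval_ds, hp=HPARAMS):
    """Return a Seq2SeqTrainer for already-tokenized train/eval datasets."""
    # Imported lazily so the config above can be inspected without a GPU stack.
    import torch
    from transformers import (
        AutoModelForSeq2SeqLM,
        AutoTokenizer,
        DataCollatorForSeq2Seq,
        EarlyStoppingCallback,
        Seq2SeqTrainer,
        Seq2SeqTrainingArguments,
    )

    tokenizer = AutoTokenizer.from_pretrained(hp["base_model"])
    model = AutoModelForSeq2SeqLM.from_pretrained(hp["base_model"])

    args = Seq2SeqTrainingArguments(
        output_dir=hp["output_dir"],
        per_device_train_batch_size=hp["batch_size"],
        per_device_eval_batch_size=hp["batch_size"],
        num_train_epochs=hp["max_epochs"],
        learning_rate=hp["learning_rate"],
        eval_strategy="epoch",   # older transformers releases call this evaluation_strategy
        save_strategy="epoch",   # must match the eval strategy for early stopping
        load_best_model_at_end=True,
        metric_for_best_model="eval_loss",
        greater_is_better=False,
        fp16=torch.cuda.is_available(),  # mixed-precision training only on CUDA
    )

    return Seq2SeqTrainer(
        model=model,
        args=args,
        train_dataset=train_ds,
        eval_dataset=eval_ds,
        data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
        callbacks=[EarlyStoppingCallback(early_stopping_patience=hp["patience"])],
    )
```

With tokenized splits in hand, `build_trainer(train_ds, eval_ds).train()` would run until validation loss stops improving or the epoch ceiling is reached.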
🧪 Quantization (FP16)
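The model weights were converted to half precision (FP16) to reduce memory usage and speed up inference without compromising translation quality. Strictly speaking this is a precision cast rather than integer quantization.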
model = model.half()
model.save_pretrained("quantized_model_fp16")
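Loading the saved FP16 checkpoint for inference might look like the sketch below. It assumes the tokenizer is taken from the base model (the snippet above saves only the model weights) and falls back to FP32 on CPU, where half-precision operations tend to be slow or unsupported. The helper names `pick_device_and_dtype` and `translate` are illustrative, not part of this repository.

```python
def pick_device_and_dtype(cuda_available):
    """Use FP16 on GPU; fall back to FP32 on CPU."""
    return ("cuda", "float16") if cuda_available else ("cpu", "float32")


def translate(sentences, model_dir="quantized_model_fp16",
              tokenizer_name="Helsinki-NLP/opus-mt-en-es"):
    """Translate a list of English sentences into Spanish."""
    # Imported lazily so the helper above works without torch installed.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    device, dtype_name = pick_device_and_dtype(torch.cuda.is_available())

    # Assumption: the tokenizer comes from the base model, since only the
    # model weights were saved to model_dir.
    tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(
        model_dir, torch_dtype=getattr(torch, dtype_name)
    ).to(device).eval()

    batch = tokenizer(
        sentences, return_tensors="pt", padding=True, truncation=True
    ).to(device)
    with torch.no_grad():
        output_ids = model.generate(**batch, max_new_tokens=128)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

A call such as `translate(["Where is the train station?"])` returns a list with one Spanish string per input sentence.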
✅ Scoring
BLEU Score: ~34+
Evaluation Metric: sacrebleu on validation set
Qualitative Check: Translations spot-checked on real-world sample sentences
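A corpus-level BLEU score of this kind can be computed with sacrebleu along the following lines. This is a sketch under assumptions: it presumes one reference translation per sentence, and the helper names are illustrative. The exact evaluation script behind the reported score is not included in this README.

```python
def as_reference_streams(references):
    """Wrap a single list of references in the list-of-streams shape sacrebleu expects."""
    return [list(references)]


def corpus_bleu_score(hypotheses, references):
    """Return corpus-level BLEU for model outputs against one reference each."""
    if len(hypotheses) != len(references):
        raise ValueError("hypotheses and references must have the same length")
    # Imported lazily; install with `pip install sacrebleu`.
    import sacrebleu

    result = sacrebleu.corpus_bleu(list(hypotheses), as_reference_streams(references))
    return result.score
```

Here `hypotheses` would be the model's Spanish outputs for the validation sentences and `references` the corresponding gold translations, in the same order.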