
🧠 English to Spanish Translation AI Model

This repository contains a Transformer-based AI model fine-tuned for English to Spanish text translation. The model has been trained, quantized to FP16, and evaluated for translation quality. It delivers high-accuracy translations and is suitable for real-world use cases such as educational tools, real-time communication, and travel assistants.


🚀 Features

  • 🔁 Language Pair: English → Spanish
  • 🔧 Model: Helsinki-NLP/opus-mt-en-es
  • 🧪 Quantized: FP16 for efficient inference
  • 🎯 High Accuracy: Scored well on validation sets
  • ⚡ CUDA Enabled: Fast training and inference

📊 Dataset Used

Hugging Face Dataset: OscarNav/spa-eng

  • Source: OscarNav
  • Language Pair: en-es
  • Dataset Size: ~107K sentence pairs

from datasets import load_dataset

dataset = load_dataset("OscarNav/spa-eng", lang1="en", lang2="es")

🛠️ Model Training & Fine-Tuning

  • Pretrained Base Model: Helsinki-NLP/opus-mt-en-es

  • Tokenizer: AutoTokenizer from Hugging Face Transformers

  • Training Environment: Kaggle Notebook with CUDA GPU

  • Batch Size: 16

  • Epochs: 3–5 (based on early stopping)

  • Optimizer: AdamW

  • Loss Function: CrossEntropyLoss
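The training setup above (AdamW, CrossEntropyLoss, batch size 16, CUDA when available) can be sketched as a bare optimization loop. The linear layer below is a toy stand-in so the snippet runs anywhere; the actual run fine-tuned `AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-es")` on a Kaggle GPU:

```python
import torch
from torch.nn import CrossEntropyLoss, Linear
from torch.optim import AdamW

# Toy stand-in model so this sketch runs anywhere; the real run fine-tuned
# AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-es").
device = "cuda" if torch.cuda.is_available() else "cpu"
vocab_size = 32
model = Linear(8, vocab_size).to(device)

optimizer = AdamW(model.parameters(), lr=2e-5)  # AdamW, as listed above
loss_fn = CrossEntropyLoss()                    # CrossEntropyLoss, as listed above

for epoch in range(3):                          # 3 epochs (the run used 3-5)
    # One dummy batch of 16 items, matching the batch size above.
    inputs = torch.randn(16, 8, device=device)
    labels = torch.randint(0, vocab_size, (16,), device=device)
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)
    loss.backward()
    optimizer.step()
```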

🧪 Quantization (FP16)

The model was quantized to FP16 to reduce memory usage and speed up inference without compromising translation quality.

model = model.half()
model.save_pretrained("quantized_model_fp16")
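To see what `.half()` does, here is a self-contained illustration on a toy layer (a stand-in, not the translation model): every floating-point parameter goes from 32-bit to 16-bit, halving its memory footprint.

```python
import torch

# Toy layer standing in for the fine-tuned model; .half() converts
# every floating-point parameter from float32 to float16.
layer = torch.nn.Linear(4, 4)
print(layer.weight.dtype)  # torch.float32 before conversion
layer = layer.half()
print(layer.weight.dtype)  # torch.float16 after: half the memory per weight
```

The saved checkpoint can then be reloaded in half precision with the standard `torch_dtype` argument, e.g. `AutoModelForSeq2SeqLM.from_pretrained("quantized_model_fp16", torch_dtype=torch.float16)`.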

✅ Scoring

BLEU Score: ~34

  • Evaluation Metric: sacrebleu on validation set

  • Inference Accuracy: Verified using real-world sample sentences

📦 Model Artifacts

  • Format: Safetensors
  • Model size: 77.5M params
  • Tensor type: F16