EA-HS: East Africa Hate Speech Classifier v3

Multilingual hate speech classifier for East African languages, built for conflict monitoring and peacebuilding applications.

Model Details

  • Base model: Davlan/afro-xlmr-base (Africa-focused XLM-RoBERTa)
  • Fine-tuned on: AfriHate (7 African languages) + HatEval (Arabic) + HateXplain (English)
  • Labels: 0 (not hate), 1 (hate), 2 (offensive)
  • Languages: Swahili, Somali, Amharic, Oromo, Tigrinya, Kinyarwanda, Nigerian Pidgin, Arabic, English

Performance

Version        Base model         Accuracy   F1
v3 (current)   afro-xlmr-base     77.10%     76.87%
v2             xlm-roberta-base   76.18%     75.99%
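The card does not say how F1 is averaged or which evaluation split was used. As a rough sketch of how figures like these are typically computed, assuming macro averaging over the three label ids (an assumption, not something stated here), the two metrics can be written in plain Python. The toy labels below are illustrative only and are not taken from the model's evaluation data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, labels=(0, 1, 2)):
    """Unweighted mean of per-class F1 over the given label ids.

    Assumption: macro averaging. The model card does not state
    whether its F1 is macro, micro, or weighted.
    """
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted examples contributes 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels for illustration only; not data from the model card.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```

With imbalanced classes, macro F1 and accuracy can diverge noticeably, so it is worth confirming the averaging mode before comparing these numbers with other models.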

Usage

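from transformers import pipeline
# Downloads the checkpoint from the Hugging Face Hub on first use.
classifier = pipeline('text-classification', model='KSvendsen/EA-HS')
result = classifier('This is a test sentence')
print(result)  # a list with one {'label': ..., 'score': ...} dict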
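The pipeline returns a list of dicts with `label` and `score` keys. Whether the `label` string is already human-readable depends on the checkpoint's configuration, which this card does not show. The sketch below assumes the default `LABEL_<id>` naming and maps ids to the label names listed under Model Details. `LABEL_NAMES` and `readable` are hypothetical helpers, and the example prediction is a stand-in rather than real model output.

```python
# Label ids as listed under Model Details.
LABEL_NAMES = {0: "not hate", 1: "hate", 2: "offensive"}


def readable(prediction):
    """Map one pipeline prediction dict to a (name, score) pair."""
    raw = prediction["label"]
    prefix, _, suffix = raw.partition("_")
    # Assumption: the checkpoint uses default names such as "LABEL_1".
    # Any other label string is passed through unchanged.
    if prefix == "LABEL" and suffix.isdigit():
        name = LABEL_NAMES.get(int(suffix), raw)
    else:
        name = raw
    return name, prediction["score"]


# Stand-in for classifier output; the score is made up for illustration.
example = [{"label": "LABEL_2", "score": 0.91}]
print([readable(p) for p in example])
```

In real use, `example` would be replaced by the `result` list returned by the classifier.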

Training

  • 5 epochs, batch size 16, learning rate 2e-5
  • Class-weighted loss + minority upsampling
  • ~95k training samples across 9 languages
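The card names class-weighted loss and minority upsampling but gives no details of either scheme. One common way to realize both is sketched below. It assumes inverse-frequency ("balanced") weights and upsampling by random duplication up to the size of the largest class. Both choices, along with the helper names `balanced_class_weights` and `upsample_minorities`, are assumptions for illustration and may differ from what was actually used in training.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}


def upsample_minorities(samples, labels, seed=0):
    """Duplicate random minority examples until each class matches the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out = list(zip(samples, labels))
    for c, count in counts.items():
        pool = [(s, l) for s, l in zip(samples, labels) if l == c]
        out.extend(rng.choice(pool) for _ in range(target - count))
    rng.shuffle(out)
    return out


# Toy data for illustration only.
labels = [0, 0, 0, 0, 1, 2, 2]
samples = [f"text {i}" for i in range(len(labels))]

weights = balanced_class_weights(labels)
balanced = upsample_minorities(samples, labels)

print(weights)
print(Counter(label for _, label in balanced))
```

The weights would typically be passed to the loss function as per-class multipliers. Applying full upsampling and full inverse-frequency weighting together can over-correct, so in practice one or both are often applied only partially.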

Developed by

MERLx / RIKO - AI-augmented conflict monitoring

Model size: 0.3B params (Safetensors, tensor type F32)
