Test Eval Results

Overall Test Set Accuracy: 0.9280

Per-Class Test Set Accuracy (share of each class's test samples that were classified correctly):

  • sadness: Accuracy = 0.9656 (581 samples)
  • joy: Accuracy = 0.9511 (695 samples)
  • love: Accuracy = 0.7862 (159 samples)
  • anger: Accuracy = 0.9345 (275 samples)
  • fear: Accuracy = 0.8839 (224 samples)
  • surprise: Accuracy = 0.8182 (66 samples)
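
As a quick consistency check, the overall accuracy should equal the per-class accuracies weighted by class size. The sketch below recomputes it from the figures listed above; the variable names are illustrative and not part of the original evaluation code.

```python
# Per-class test accuracy and sample counts, copied from the list above.
per_class = {
    "sadness": (0.9656, 581),
    "joy": (0.9511, 695),
    "love": (0.7862, 159),
    "anger": (0.9345, 275),
    "fear": (0.8839, 224),
    "surprise": (0.8182, 66),
}

total_samples = sum(n for _, n in per_class.values())

# Approximate number of correct predictions per class (accuracy * count).
total_correct = sum(acc * n for acc, n in per_class.values())
overall_accuracy = total_correct / total_samples

# Unweighted mean: every class counts equally, regardless of size.
macro_accuracy = sum(acc for acc, _ in per_class.values()) / len(per_class)

print(f"Samples: {total_samples}")
print(f"Overall (weighted) accuracy: {overall_accuracy:.4f}")
print(f"Macro (unweighted) accuracy: {macro_accuracy:.4f}")
```

The weighted figure reproduces the reported 0.9280. The unweighted mean comes out lower, at about 0.89, because the two weakest classes (love and surprise) are also the two smallest.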

Model Description

This model is a fine-tuned version of distilroberta-base specifically optimized for emotion classification in text. The model can identify six distinct emotions:

  • Sadness: expressions of sorrow, disappointment, or depression
  • Joy: expressions of happiness, excitement, or contentment
  • Love: expressions of affection, care, or romantic feelings
  • Anger: expressions of frustration, rage, or annoyance
  • Fear: expressions of worry, anxiety, or terror
  • Surprise: expressions of astonishment, shock, or unexpected reactions

The architecture is DistilRoBERTa, a distilled version of RoBERTa that is smaller and faster while retaining most of RoBERTa's performance. A classification head on top of the self-attention encoder produces one score per emotion category, and a softmax converts those scores into probabilities.
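
To illustrate the last step, here is a minimal sketch of how raw scores (logits) become a probability distribution over the six emotions. The logits are made up for illustration, and the label order is an assumption; the authoritative mapping is the model's `config.id2label`.

```python
import math

# Assumed label order (the order used throughout this card);
# check model.config.id2label for the real mapping.
LABELS = ["sadness", "joy", "love", "anger", "fear", "surprise"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for illustration only, not real model output.
example_logits = [0.2, 3.1, 1.4, -0.5, -1.0, 0.3]
probs = softmax(example_logits)

# Rank emotions from most to least likely.
ranked = sorted(zip(LABELS, probs), key=lambda pair: pair[1], reverse=True)
for label, p in ranked:
    print(f"{label:<10}: {p:.2%}")
```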

Key Features:

  • Based on DistilRoBERTa architecture
  • Fine-tuned on English Twitter messages labeled with six emotions
  • Outputs probability distributions across 6 emotion categories
  • Compact (82.1M parameters), which suits low-latency classification
  • Accepts inputs of varying length, up to 512 tokens

Dataset Overview

The Emotion dataset (dair-ai/emotion) is a collection of English Twitter messages labeled with six basic emotions: sadness, joy, love, anger, fear, and surprise. It is an adapted version of the dataset presented in the paper "CARER: Contextualized Affect Representations for Emotion Recognition" (Saravia et al., 2018).

Intended Uses & Limitations

Intended Uses

  1. Content Analysis:

    • Social media sentiment monitoring
    • Customer feedback emotional analysis
    • User experience feedback classification
    • Community content moderation
  2. Research Applications:

    • Psychological studies
    • Social behavior analysis
    • Communication research
    • Emotional pattern recognition
  3. Business Applications:

    • Customer service response prioritization
    • Brand sentiment analysis
    • User satisfaction monitoring
    • Marketing response analysis

Limitations

  1. Language Limitations:

    • Primarily optimized for English text
    • May not understand multilingual expressions
    • Limited understanding of slang or colloquialisms
  2. Technical Limitations:

    • Maximum input length of 512 tokens
    • May struggle with heavy sarcasm or irony
    • Cannot handle images or multimodal content
    • Single-label classification (one emotion per text)
  3. Performance Considerations:

    • Emotions can be subjective and context-dependent
    • May show biases present in training data
    • Performance varies with text length and complexity
    • May not capture subtle emotional nuances

Ethical Considerations

  • Should not be used for critical decision-making without human oversight
  • May perpetuate biases present in training data
  • Privacy must be considered when analyzing personal communications
  • Should not be used for surveillance or without user consent
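
One way to keep a human in the loop is to flag low-confidence predictions for manual review. The sketch below assumes predictions in the same list-of-dicts format that `predict_emotions` returns in the Advanced Usage section; the function name `route_prediction` and the 0.80 threshold are hypothetical and should be tuned on held-out data.

```python
def route_prediction(predictions, threshold=0.80):
    """Return (label, needs_review) for the top-scoring emotion.

    predictions: list of {"emotion": str, "confidence": float} dicts.
    threshold: hypothetical confidence cut-off below which a human
               should review the prediction.
    """
    top = max(predictions, key=lambda p: p["confidence"])
    return top["emotion"], top["confidence"] < threshold


# Made-up scores for illustration only.
confident = [
    {"emotion": "joy", "confidence": 0.95},
    {"emotion": "love", "confidence": 0.05},
]
uncertain = [
    {"emotion": "love", "confidence": 0.48},
    {"emotion": "joy", "confidence": 0.45},
    {"emotion": "surprise", "confidence": 0.07},
]

print(route_prediction(confident))
print(route_prediction(uncertain))
```

A confidence threshold is only a coarse filter: a model can be confidently wrong, so it complements human oversight rather than replacing it.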

Training and Evaluation Data

Training Data

The model was trained on the dair-ai/emotion dataset, which includes:

  • Total samples: 20,000
  • Training samples: 16,000
  • Validation samples: 2,000
  • Test samples: 2,000
  • Data source: Twitter
  • Data balance: Imbalanced across the 6 emotion categories; in the test split, joy (695) and sadness (581) are the largest classes and surprise (66) is the smallest

Data Preprocessing

  1. Text cleaning and normalization
  2. Tokenization using the DistilRoBERTa tokenizer
  3. Padding and truncation to 512 tokens
  4. Label encoding for 6 emotion categories
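
Steps 3 and 4 can be sketched in plain Python as follows. This is a toy illustration, not the original preprocessing code: in practice the Hugging Face tokenizer handles padding and truncation and keeps the closing special token when it truncates, which this simplified slice does not. The label order and the pad id of 1 (the RoBERTa-family convention) are assumptions, and `pad_or_truncate` is a hypothetical helper.

```python
# Assumed label encoding (the order used throughout this card).
LABEL2ID = {"sadness": 0, "joy": 1, "love": 2, "anger": 3, "fear": 4, "surprise": 5}

MAX_LENGTH = 512
PAD_ID = 1  # assumed pad token id for RoBERTa-family tokenizers


def pad_or_truncate(token_ids, max_length=MAX_LENGTH, pad_id=PAD_ID):
    """Cut or right-pad a list of token ids to exactly max_length.

    Returns (input_ids, attention_mask); the mask is 1 for real
    tokens and 0 for padding.
    """
    ids = list(token_ids[:max_length])
    n_pad = max_length - len(ids)
    attention_mask = [1] * len(ids) + [0] * n_pad
    return ids + [pad_id] * n_pad, attention_mask


# Short input: padded on the right (token ids are made up).
ids, mask = pad_or_truncate([0, 100, 200, 2], max_length=6)
print(ids, mask)

# Long input: truncated to the limit.
long_ids, long_mask = pad_or_truncate(list(range(10)), max_length=4)
print(long_ids, long_mask)

print(LABEL2ID["joy"])
```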

Training Process

  • Architecture: DistilRoBERTa-base
  • Hardware Used: Google Colab GPU

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 5
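
These settings imply a specific learning-rate curve. The sketch below works it out, assuming a single device and no gradient accumulation (consistent with the 500 steps per epoch in the training results) and the usual linear warmup followed by linear decay to zero. The helper `linear_lr` is illustrative, not the library's implementation.

```python
TRAIN_SAMPLES = 16_000  # training split size stated in this card
BATCH_SIZE = 32
NUM_EPOCHS = 5
WARMUP_STEPS = 500
PEAK_LR = 5e-5

steps_per_epoch = TRAIN_SAMPLES // BATCH_SIZE
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step):
    """Learning rate after `step` optimizer updates."""
    if step < WARMUP_STEPS:
        # Warmup: ramp linearly from 0 up to the peak rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Decay: fall linearly from the peak to 0 at the final step.
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / (total_steps - WARMUP_STEPS)


print(f"steps per epoch: {steps_per_epoch}, total steps: {total_steps}")
for step in (0, 250, 500, 1500, 2500):
    print(f"step {step:>4}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the warmup spans exactly the first epoch, so the model only trains at the full 5e-05 rate at the start of epoch 2.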

Training results

Per-class accuracy columns are rounded to four decimal places.

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Sadness | Joy    | Love   | Anger  | Fear   | Surprise |
|---------------|-------|------|-----------------|----------|---------|--------|--------|--------|--------|----------|
| 0.7921        | 1.0   | 500  | 0.2453          | 0.9205   | 0.9618  | 0.9531 | 0.8202 | 0.8764 | 0.8255 | 0.9753   |
| 0.2062        | 2.0   | 1000 | 0.1608          | 0.938    | 0.9564  | 0.9830 | 0.7416 | 0.9345 | 0.9434 | 0.8519   |
| 0.1342        | 3.0   | 1500 | 0.1418          | 0.9335   | 0.9582  | 0.9759 | 0.7809 | 0.9127 | 0.9057 | 0.8765   |
| 0.1001        | 4.0   | 2000 | 0.1336          | 0.941    | 0.9636  | 0.9688 | 0.8371 | 0.9455 | 0.9009 | 0.8642   |
| 0.0751        | 5.0   | 2500 | 0.1445          | 0.942    | 0.9764  | 0.9574 | 0.8596 | 0.9418 | 0.9104 | 0.8395   |
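
The results above can be read two ways when choosing a checkpoint: by lowest validation loss or by highest accuracy. The card does not say which criterion was used, so the sketch below simply compares both using the figures from the training results.

```python
# (epoch, training loss, validation loss, accuracy) from the training results.
history = [
    (1, 0.7921, 0.2453, 0.9205),
    (2, 0.2062, 0.1608, 0.9380),
    (3, 0.1342, 0.1418, 0.9335),
    (4, 0.1001, 0.1336, 0.9410),
    (5, 0.0751, 0.1445, 0.9420),
]

best_by_loss = min(history, key=lambda row: row[2])
best_by_accuracy = max(history, key=lambda row: row[3])

print(f"Lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[2]:.4f})")
print(f"Highest accuracy: epoch {best_by_accuracy[0]} ({best_by_accuracy[3]:.4f})")

# Flag epochs where training loss fell but validation loss rose,
# a common early sign of overfitting.
for prev, curr in zip(history, history[1:]):
    if curr[1] < prev[1] and curr[2] > prev[2]:
        print(f"Validation loss rose at epoch {curr[0]} while training loss fell")
```

The two criteria disagree here: validation loss is lowest at epoch 4, while accuracy peaks at epoch 5 by a margin of 0.001, and the rise in validation loss at epoch 5 suggests further epochs would likely not help.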

Usage Examples

Basic Usage

from transformers import pipeline

# Initialize the classifier
classifier = pipeline("text-classification", model="mananshah296/roberta-emotion")

# Example text
text = "I'm so happy today!"

# Make prediction
result = classifier(text)
print(f"Text: {text}")
print(f"Emotion: {result[0]['label']}")  # top predicted label
print(f"Confidence: {result[0]['score']:.2%}")  # its probability, shown as a percentage

Advanced Usage with Custom Processing

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from datetime import datetime, timezone

def setup_emotion_classifier(model_name):
    # Load tokenizer and model
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()  # inference mode: disables dropout

    # Move to available device
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device)
    print(f"Using device: {device}")

    # Print session info (datetime.utcnow() is deprecated since Python 3.12)
    current_time = datetime.now(timezone.utc).strftime('%Y-%m-%d %H:%M:%S')
    print(f"Current Date and Time (UTC): {current_time}")

    return tokenizer, model, device

def predict_emotions(text, tokenizer, model, device):
    # Prepare input
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
    inputs = {k: v.to(device) for k, v in inputs.items()}
    
    # Get prediction
    with torch.no_grad():
        outputs = model(**inputs)
        probs = torch.nn.functional.softmax(outputs.logits, dim=1)[0]
    
    # Convert to readable format
    predictions = []
    for i, prob in enumerate(probs):
        emotion = model.config.id2label[i]
        score = prob.item()
        predictions.append({
            "emotion": emotion,
            "confidence": score
        })
    
    # Sort by confidence
    return sorted(predictions, key=lambda x: x["confidence"], reverse=True)

# Setup
model_name = "mananshah296/roberta-emotion"  # replace with your model name
tokenizer, model, device = setup_emotion_classifier(model_name)

# Example usage
text = "This is absolutely amazing news!"
predictions = predict_emotions(text, tokenizer, model, device)

# Print results
print(f"Analyzing: {text}")
print("Emotions detected:")
for pred in predictions:
    print(f"{pred['emotion']:<10}: {pred['confidence']:.2%}")

Framework versions

  • Transformers 4.48.3
  • Pytorch 2.5.1+cu124
  • Datasets 3.3.0
  • Tokenizers 0.21.0

Model size

  • Parameters: 82.1M
  • Tensor type: F32
  • Weights format: Safetensors