T5 Question Generation with Answer Masking

This repository contains a T5-base model fine-tuned to generate question-answer pairs from a given context. Leveraging T5’s text-to-text framework and a training strategy in which the answer is masked 30% of the time, the model generates both a coherent question and the corresponding answer, even when the answer is partially or entirely missing from the input.

Model Overview

Built with PyTorch Lightning, this implementation adapts the pre-trained T5-base model for the dual task of question generation and answer prediction. By randomly replacing the answer with the [MASK] token during training, the model learns to handle scenarios where the answer is partially or completely missing, thereby improving its robustness and versatility.

Data Processing

Input Construction

Each input sample is formatted as follows:

context: [context] answer: [MASK or answer] </s>
  • Answer Masking: During training, the answer is replaced with the [MASK] token 30% of the time. This forces the model to generate both the question and the answer even when provided with partial input.
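
As a rough illustration, the masking step might look like the following during preprocessing (the helper name and structure are illustrative; the 30% probability and the [MASK] literal come from the description above):

import random

def build_input(context, answer, mask_prob=0.3):
    # With probability mask_prob, hide the gold answer so the model
    # must generate it alongside the question.
    if random.random() < mask_prob:
        answer = "[MASK]"
    return f"context: {context} answer: {answer} </s>"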

Target Construction

Each target sample is formatted as:

question: [question] answer: [answer] </s>

This format ensures that the model generates a question first, followed by the corresponding answer.
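
A matching sketch for target construction (the helper name is illustrative):

def build_target(question, answer):
    # The target always contains both fields, even when the input answer was masked.
    return f"question: {question} answer: {answer} </s>"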

Training Details

  • Framework: PyTorch Lightning
  • Base Model: T5-base
  • Optimizer: AdamW with linear learning rate scheduling (see the sketch after this list)
  • Batch Size: 8 (training)
  • Maximum Token Length:
    • Input: 512 tokens
    • Target: 64 tokens
  • Number of Training Epochs: 4
  • Answer Masking Probability: 30%
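
The repository's actual training code is not reproduced here, but a minimal PyTorch Lightning sketch of the configuration above might look like this (the learning rate, warmup steps, total step count, and class name are assumptions, not values from the model card):

import torch
import pytorch_lightning as pl
from transformers import T5ForConditionalGeneration, get_linear_schedule_with_warmup

class QAGenerationModule(pl.LightningModule):
    def __init__(self, lr=3e-4, total_steps=10_000):  # assumed values
        super().__init__()
        self.model = T5ForConditionalGeneration.from_pretrained("t5-base")
        self.lr = lr
        self.total_steps = total_steps

    def training_step(self, batch, batch_idx):
        # Labels follow the "question: ... answer: ..." target format.
        outputs = self.model(
            input_ids=batch["input_ids"],
            attention_mask=batch["attention_mask"],
            labels=batch["labels"],
        )
        return outputs.loss

    def configure_optimizers(self):
        optimizer = torch.optim.AdamW(self.parameters(), lr=self.lr)
        # Linear decay; zero warmup is an assumption here.
        scheduler = get_linear_schedule_with_warmup(
            optimizer, num_warmup_steps=0, num_training_steps=self.total_steps
        )
        return [optimizer], [{"scheduler": scheduler, "interval": "step"}]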

Evaluation Metrics

The model’s performance is evaluated using BLEU scores for both generated questions and answers. The following table summarizes the evaluation metrics on the test set:

Metric    Question    Answer
BLEU-1    0.3127      0.7243
BLEU-2    0.2073      0.5448
BLEU-3    0.1526      0.4036
BLEU-4    0.1159      0.3127

Note: BLEU scores measure n‑gram overlap between generated outputs and references. While useful, they do not capture every aspect of generation quality.
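
The model card does not say which BLEU implementation was used; as one possibility, corpus-level BLEU-n can be computed with NLTK (the smoothing choice below is an assumption):

from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

def bleu_n(references, hypotheses, n):
    # references: one tokenized reference per example; hypotheses: tokenized outputs.
    weights = tuple(1.0 / n for _ in range(n)) + (0.0,) * (4 - n)
    return corpus_bleu(
        [[ref] for ref in references],
        hypotheses,
        weights=weights,
        smoothing_function=SmoothingFunction().method1,
    )

For example, bleu_n(refs, hyps, 1) yields BLEU-1 and bleu_n(refs, hyps, 4) yields BLEU-4.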

How to Use

You can run inference with this model using the Hugging Face Transformers library:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM


model_name = "fares7elsadek/t5-base-finetuned-question-generation"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def generate_qa(context, answer="[MASK]", max_length=64):
    """
    Generates a question and answer pair from the provided context.

    Args:
        context (str): The context passage.
        answer (str): The answer text. Use "[MASK]" to prompt the model to predict the answer.
        max_length (int): Maximum length of the generated sequence.
        
    Returns:
        str: The generated question and answer pair.
    """
    input_text = f"context: {context} answer: {answer} </s>"
    inputs = tokenizer([input_text], return_tensors="pt", truncation=True, padding=True)
    outputs = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_length
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example inference:
context = "The Eiffel Tower was constructed in 1889 for the World's Fair in Paris."
answer = "The Eiffel Tower"  # Alternatively, use "[MASK]" to have the model predict the answer
print(generate_qa(context, answer))
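
Because the model emits a single string in the target format, a small helper can split it back into its parts (the parsing logic is illustrative and assumes the "question: ... answer: ..." layout shown above):

def split_qa(generated):
    # Expects a string like "question: ... answer: ...".
    question, _, answer = generated.partition("answer:")
    return question.replace("question:", "", 1).strip(), answer.strip()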