T5 Question Generation with Answer Masking
This repository contains a T5-base model fine-tuned to generate question-answer pairs from a given context. Leveraging T5's text-to-text framework and a training strategy in which the answer is masked 30% of the time, the model generates coherent questions together with corresponding answers, even when the answer is incomplete or missing entirely.
Model Overview
Built with PyTorch Lightning, this implementation adapts the pre-trained T5-base model for the dual task of question generation and answer prediction. By randomly replacing the answer with the [MASK]
token during training, the model learns to handle scenarios where the answer is partially or completely missing, thereby improving its robustness and versatility.
Data Processing
Input Construction
Each input sample is formatted as follows:
```
context: [context] answer: [MASK or answer] </s>
```
- Answer Masking: During training, the answer is replaced with the `[MASK]` token 30% of the time. This forces the model to generate both the question and the answer even when it receives only partial input; see the sketch below.
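A minimal sketch of this masking step, assuming a simple string template and Python's `random` module (the helper name `build_input` is illustrative, not the repository's actual code):

```python
import random

MASK_PROB = 0.30  # answer is masked with this probability during training

def build_input(context: str, answer: str, training: bool = True) -> str:
    """Assemble the model input, masking the answer 30% of the time."""
    if training and random.random() < MASK_PROB:
        answer = "[MASK]"
    return f"context: {context} answer: {answer} </s>"
```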
Target Construction
Each target sample is formatted as:
```
question: [question] answer: [answer] </s>
```
This format ensures that the model generates a question first, followed by the corresponding answer.
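The corresponding target string can be built the same way (again, `build_target` is an illustrative helper, not code from the repository):

```python
def build_target(question: str, answer: str) -> str:
    """Assemble the training target: the question first, then the answer."""
    return f"question: {question} answer: {answer} </s>"

# e.g. "question: When was the Eiffel Tower constructed? answer: 1889 </s>"
```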
Training Details
- Framework: PyTorch Lightning
- Base Model: T5-base
- Optimizer: AdamW with linear learning rate scheduling
- Batch Size: 8 (training)
- Maximum Token Length:
  - Input: 512 tokens
  - Target: 64 tokens
- Number of Training Epochs: 4
- Answer Masking Probability: 30%
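The training code itself is not included in this card. A rough sketch of how the pieces above could fit together in PyTorch Lightning follows; the class name `QGModule`, the learning rate, the warmup setting, and the step count are assumptions, not values from the actual training run:

```python
import pytorch_lightning as pl
import torch
from transformers import T5ForConditionalGeneration, get_linear_schedule_with_warmup

class QGModule(pl.LightningModule):
    def __init__(self, lr: float = 3e-4, total_steps: int = 10_000):
        super().__init__()
        self.model = T5ForConditionalGeneration.from_pretrained("t5-base")
        self.lr = lr
        self.total_steps = total_steps

    def training_step(self, batch, batch_idx):
        # batch holds tokenized inputs (up to 512 tokens) and targets (up to 64 tokens)
        outputs = self.model(
            input_ids=batch["input_ids"],
            attention_mask=batch["attention_mask"],
            labels=batch["labels"],
        )
        self.log("train_loss", outputs.loss)
        return outputs.loss

    def configure_optimizers(self):
        # AdamW with a linear learning-rate schedule, as described above
        optimizer = torch.optim.AdamW(self.model.parameters(), lr=self.lr)
        scheduler = get_linear_schedule_with_warmup(
            optimizer, num_warmup_steps=0, num_training_steps=self.total_steps
        )
        return [optimizer], [{"scheduler": scheduler, "interval": "step"}]
```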
Evaluation Metrics
The model’s performance is evaluated using BLEU scores for both generated questions and answers. The following table summarizes the evaluation metrics on the test set:
| Metric | Question | Answer |
|---|---|---|
| BLEU-1 | 0.3127 | 0.7243 |
| BLEU-2 | 0.2073 | 0.5448 |
| BLEU-3 | 0.1526 | 0.4036 |
| BLEU-4 | 0.1159 | 0.3127 |
Note: BLEU scores measure n‑gram overlap between generated outputs and references. While useful, they do not capture every aspect of generation quality.
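For reference, BLEU-n can be computed with NLTK as shown below. This is only an illustration of the metric; it is not necessarily the evaluation script used for the numbers above, and the example sentences are made up:

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "when was the eiffel tower constructed".split()
candidate = "what year was the eiffel tower built".split()

smooth = SmoothingFunction().method1
for n in range(1, 5):
    weights = tuple([1.0 / n] * n)  # uniform weights over 1..n-grams
    score = sentence_bleu([reference], candidate, weights=weights, smoothing_function=smooth)
    print(f"BLEU-{n}: {score:.4f}")
```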
How to Use
You can run inference with the Hugging Face Transformers library:
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "fares7elsadek/t5-base-finetuned-question-generation"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def generate_qa(context, answer="[MASK]", max_length=64):
    """
    Generates a question and answer pair from the provided context.

    Args:
        context (str): The context passage.
        answer (str): The answer text. Use "[MASK]" to prompt the model to predict the answer.
        max_length (int): Maximum length of the generated sequence.

    Returns:
        str: The generated question and answer pair.
    """
    # Build the input in the same format used during training.
    input_text = f"context: {context} answer: {answer} </s>"
    inputs = tokenizer([input_text], return_tensors="pt", truncation=True, padding=True)
    outputs = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_length,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example inference:
context = "The Eiffel Tower was constructed in 1889 for the World's Fair in Paris."
answer = "The Eiffel Tower"  # Alternatively, use "[MASK]" to have the model predict the answer
print(generate_qa(context, answer))
```
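If you need the question and answer as separate strings, one simple approach, assuming the output follows the `question: ... answer: ...` format described above (the helper `split_qa` is illustrative, not part of the repository):

```python
def split_qa(generated: str):
    """Split 'question: ... answer: ...' output into its two parts."""
    question, _, answer = generated.partition("answer:")
    return question.replace("question:", "").strip(), answer.strip()

question, answer = split_qa(generate_qa(context, "[MASK]"))
print("Q:", question)
print("A:", answer)
```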