RooseBERT-large-scr-cased
This model is a fine-tuned version of bert-large-cased.
It achieves the following results on the evaluation set:
- Loss: 0.9116
- Accuracy: 0.7799
- Perplexity: 2.601 (see the note below on how perplexity relates to the loss)
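For a masked-language model, perplexity is conventionally the exponential of the per-token cross-entropy loss. A minimal helper illustrating that relation (not the authors' evaluation script):

```python
import math

def perplexity_from_loss(cross_entropy_loss: float) -> float:
    # Perplexity is exp(per-token cross-entropy loss) for a language model.
    return math.exp(cross_entropy_loss)
```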
Model description
This model builds on the same architecture as bert-large-cased, leveraging transformer-based contextual embeddings to better capture the nuances of political language.
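A minimal usage sketch with the Transformers fill-mask pipeline; the repository id below is a placeholder, so substitute the actual Hub path of this checkpoint:

```python
from transformers import pipeline

# Placeholder repo id; replace with the actual Hugging Face Hub path of this checkpoint.
fill_mask = pipeline("fill-mask", model="your-org/RooseBERT-large-scr-cased")

# The model predicts the masked token from its political-debate context.
for prediction in fill_mask("The honourable member raised a question about the [MASK] bill."):
    print(prediction["token_str"], round(prediction["score"], 3))
```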
Intended Use Cases
Suitable Applications
- Political discourse analysis: Identifying patterns, sentiments, and rhetoric in debates.
- Contextual word interpretation: Understanding the meaning of words within political contexts.
- Sentiment classification: Differentiating positive, neutral, and negative sentiments in political speech (see the fine-tuning sketch after this list).
- Text generation improvement: Enhancing auto-completions and summaries in politically focused language models.
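The released checkpoint is a masked-language model, so downstream tasks such as sentiment classification need a task head and fine-tuning on labelled data. A sketch of what that setup could look like, where the repo id and the three-way label set are illustrative assumptions rather than part of this release:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder repo id; replace with the actual Hub path of this checkpoint.
MODEL_ID = "your-org/RooseBERT-large-scr-cased"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
# A freshly initialised classification head is stacked on top of the encoder;
# it must be fine-tuned on labelled political text before the outputs are meaningful.
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_ID,
    num_labels=3,
    id2label={0: "negative", 1: "neutral", 2: "positive"},
    label2id={"negative": 0, "neutral": 1, "positive": 2},
)

inputs = tokenizer("We cannot support a budget that abandons working families.", return_tensors="pt")
logits = model(**inputs).logits
print(logits.argmax(dim=-1))  # predicted label id (untrained head until fine-tuned)
```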
Limitations
- Bias Sensitivity: Since it was trained on political debates, inherent biases in the data may be reflected in the model’s outputs.
- Not Suitable for General-Purpose NLP: It is optimized specifically for political contexts.
- Does Not Perform Fact-Checking: The model does not verify factual accuracy.
Training and Evaluation Data
The model was trained on a curated dataset of political debates sourced from:
- Parliamentary transcripts
- Presidential debates and public speeches
Training procedure
Training hyperparameters
The following hyperparameters were used during training (a sketch of how they map onto Transformers TrainingArguments follows the list):
- learning_rate: 0.0001
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 4
- total_train_batch_size: 2048
- total_eval_batch_size: 512
- optimizer: adamw_torch with betas=(0.9, 0.98) and epsilon=1e-06 (no additional optimizer arguments)
- lr_scheduler_type: linear
- training_steps: 125000
- mixed_precision_training: Native AMP
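As a rough sketch of how these settings could map onto Transformers TrainingArguments, assuming an 8-GPU launch via torchrun and a placeholder output directory (this is not the authors' actual training script):

```python
from transformers import TrainingArguments

# Per-device batch size 64 on 8 GPUs with 4 gradient-accumulation steps
# reproduces the reported total train batch size of 64 * 8 * 4 = 2048.
training_args = TrainingArguments(
    output_dir="roosebert-large-scr-cased",  # placeholder output directory
    learning_rate=1e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=4,
    seed=42,
    max_steps=125_000,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-6,
    fp16=True,  # native AMP mixed precision
)
# Launch with e.g. `torchrun --nproc_per_node=8 train.py` to match the 8-device setup.
```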
Training results
| Training Loss | Epoch    | Step   | Accuracy | Validation Loss |
|:-------------:|:--------:|:------:|:--------:|:---------------:|
| No log        | 0        | 0      | 0.0000   | 10.5234         |
| 1.2678        | 12.6967  | 50000  | 0.7217   | 1.2314          |
| 1.121         | 25.3936  | 100000 | 0.7453   | 1.0977          |
| 0.9192        | 137.3328 | 125000 | 0.7799   | 0.9111          |
Framework versions
- Transformers 4.49.0.dev0
- Pytorch 2.5.1
- Datasets 3.2.0
- Tokenizers 0.21.0
Citation
If you use this model, cite us:
@misc{dore2025roosebertnewdealpolitical,
  title={RooseBERT: A New Deal For Political Language Modelling},
  author={Deborah Dore and Elena Cabrio and Serena Villata},
  year={2025},
  eprint={2508.03250},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2508.03250},
}