---
library_name: transformers
license: apache-2.0
language:
- de
base_model:
- google-bert/bert-base-german-cased
pipeline_tag: token-classification
---
|
|
|
# C-BERT
|
|
|
CausalBERT (C-BERT) is a German BERT model fine-tuned in a multi-task setup to extract causal attributions from text.
|
|
|
## Model details

- **Model architecture**: BERT-base-German-cased with token-classification and relation-classification heads
- **Fine-tuned on**: a German environmental causal attribution corpus
- **Tasks**:
  1. Token classification (BIO tags for INDICATOR / ENTITY spans)
  2. Relation classification (CAUSE, EFFECT, INTERDEPENDENCY)
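To illustrate the BIO scheme used in task 1, here is a small, self-contained decoding helper (a sketch for illustration only — it is not part of the `causalbert` library) that groups `B-`/`I-` tags into labelled spans:

```python
def decode_bio(tokens, tags):
    """Group BIO tags (e.g. B-ENTITY, I-ENTITY, O) into (label, text) spans."""
    spans, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                spans.append(current)
            current = [tag[2:], [token]]  # start a new span
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(token)      # continue the open span
        else:
            if current:                   # "O" or inconsistent tag closes it
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(label, " ".join(words)) for label, words in spans]

tokens = ["Autoverkehr", "verursacht", "Bienensterben", "."]
tags = ["B-ENTITY", "B-INDICATOR", "B-ENTITY", "O"]
print(decode_bio(tokens, tags))
# → [('ENTITY', 'Autoverkehr'), ('INDICATOR', 'verursacht'), ('ENTITY', 'Bienensterben')]
```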
|
|
|
## Usage

Install the custom [causalbert](https://github.com/norygami/causalbert) library. Once installed, run inference like this:
|
```python
from causalbert.infer import load_model, analyze_sentence_with_confidence

# Load the fine-tuned model, tokenizer, and config onto the available device.
model, tokenizer, config, device = load_model("norygano/C-BERT")

result = analyze_sentence_with_confidence(
    model, tokenizer, config, "Autoverkehr verursacht Bienensterben.", []
)
```
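Conceptually, the two task heads combine as follows: token classification yields labelled spans, and relation classification assigns a label to indicator–entity pairs. The sketch below is purely illustrative (the names `build_attributions`, `spans`, and `relations` are hypothetical, not the library's API) and also shows why tagging errors propagate — a missed span never forms a pair:

```python
def build_attributions(spans, relations):
    """Pair each INDICATOR span with each ENTITY span; `relations` maps
    (indicator, entity) pairs to CAUSE / EFFECT / INTERDEPENDENCY."""
    indicators = [text for label, text in spans if label == "INDICATOR"]
    entities = [text for label, text in spans if label == "ENTITY"]
    return [(ind, relations[(ind, ent)], ent)
            for ind in indicators for ent in entities
            if (ind, ent) in relations]

spans = [("ENTITY", "Autoverkehr"),
         ("INDICATOR", "verursacht"),
         ("ENTITY", "Bienensterben")]
relations = {("verursacht", "Autoverkehr"): "CAUSE",
             ("verursacht", "Bienensterben"): "EFFECT"}
print(build_attributions(spans, relations))
# → [('verursacht', 'CAUSE', 'Autoverkehr'), ('verursacht', 'EFFECT', 'Bienensterben')]
```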
|
|
|
## Training

- **Base model**: `google-bert/bert-base-german-cased`
- **Epochs**: 3, **Learning rate**: 2e-5, **Batch size**: 8
- See [train.py](https://github.com/norygami/causalbert/blob/main/causalbert/train.py) for details.
|
|
|
## Limitations

- German only.
- Sentence-level: cross-sentence causal relations are not handled.
- Relation classification depends on the detected spans, so token-tagging errors propagate.