---
library_name: transformers
license: apache-2.0
language:
- de
base_model:
- google-bert/bert-base-german-cased
pipeline_tag: token-classification
---

# C-BERT

CausalBERT (C-BERT) is a multi-task fine-tuned German BERT that extracts causal attributions.

## Model details
- **Model architecture**: BERT-base-German-cased + token & relation heads  
- **Fine-tuned on**: environmental causal attribution corpus (German)  
- **Tasks**:  
  1. Token classification (BIO tags for INDICATOR / ENTITY)  
  2. Relation classification (CAUSE, EFFECT, INTERDEPENDENCY)
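To make the token-classification task concrete, the BIO scheme for the two span types can be sketched as below. Note the label names and ordering here are illustrative assumptions; the authoritative mapping is the `id2label` entry in the model's config.

```python
# Hypothetical BIO label scheme for the token-classification head.
# The actual labels should be read from the model config (id2label).
span_types = ["INDICATOR", "ENTITY"]
bio_labels = ["O"] + [f"{prefix}-{t}" for t in span_types for prefix in ("B", "I")]
id2label = dict(enumerate(bio_labels))
label2id = {label: i for i, label in id2label.items()}
# bio_labels → ["O", "B-INDICATOR", "I-INDICATOR", "B-ENTITY", "I-ENTITY"]
```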

## Usage
Install the custom [causalbert library](https://github.com/norygami/causalbert), then run inference like so:
```python
from causalbert.infer import load_model, analyze_sentence_with_confidence

# Load the fine-tuned model, tokenizer, and config from the Hub
model, tokenizer, config, device = load_model("norygano/C-BERT")
result = analyze_sentence_with_confidence(
    model, tokenizer, config, "Autoverkehr verursacht Bienensterben.", []
)
```

## Training

- **Base model**: `google-bert/bert-base-german-cased`  
- **Epochs**: 3, **LR**: 2e-5, **Batch size**: 8  
- See [train.py](https://github.com/norygami/causalbert/blob/main/causalbert/train.py) for details.
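For the relation head, one common way to pair two detected spans for classification is to wrap them in marker tokens before encoding. The sketch below illustrates that general technique; the marker names and the helper itself are hypothetical, and `train.py` may implement span pairing differently.

```python
# Hypothetical span-marker scheme for relation classification:
# wrap the two candidate spans in marker tokens before encoding.
def mark_spans(tokens, span_a, span_b):
    """span_a / span_b are (start, end) token index pairs, end exclusive."""
    out = []
    for i, tok in enumerate(tokens):
        if i == span_a[0]:
            out.append("[E1]")
        if i == span_b[0]:
            out.append("[E2]")
        out.append(tok)
        if i == span_a[1] - 1:
            out.append("[/E1]")
        if i == span_b[1] - 1:
            out.append("[/E2]")
    return out

tokens = ["Autoverkehr", "verursacht", "Bienensterben", "."]
marked = mark_spans(tokens, (0, 1), (2, 3))
# → ["[E1]", "Autoverkehr", "[/E1]", "verursacht", "[E2]", "Bienensterben", "[/E2]", "."]
```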

## Limitations

- German only.  
- Sentence-level: cross-sentence causality is not handled.  
- Relation classification depends on the detected spans, so token-tagging errors propagate.