metadata
library_name: transformers
license: apache-2.0
language:
  - de
base_model:
  - google-bert/bert-base-german-cased
pipeline_tag: token-classification

C-BERT

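CausalBERT (C-BERT) is a German BERT model fine-tuned on multiple tasks to extract causal attributions from text: it tags indicator and entity spans and classifies the causal relations between them.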

Model details

  • Model architecture: BERT-base-German-cased encoder with a token classification head and a relation classification head
  • Fine-tuned on: environmental causal attribution corpus (German)
  • Tasks:
    1. Token classification (BIO tags for INDICATOR / ENTITY)
    2. Relation classification (CAUSE, EFFECT, INTERDEPENDENCY)
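The token classification task can be illustrated with a small BIO decoder. This is a minimal sketch, not the model's own post-processing: the tag strings (`B-ENTITY`, `I-INDICATOR`, and so on) and the helper name `bio_to_spans` are assumptions made for illustration, and the tags for the example sentence are written by hand rather than predicted.

```python
from typing import List, Tuple

# (label, start_token, end_token_exclusive); this layout is an assumption.
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Collapse a BIO tag sequence into labelled token spans."""
    spans: List[Span] = []
    label, start = None, 0
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" flushes the last span
        prefix, _, name = tag.partition("-")
        continues = prefix == "I" and name == label
        if label is not None and not continues:
            spans.append((label, start, i))
            label = None
        if prefix == "B" or (prefix == "I" and label is None):
            # A stray I- tag without a matching B- is treated as a span start.
            label, start = name, i
    return spans


tokens = ["Autoverkehr", "verursacht", "Bienensterben", "."]
tags = ["B-ENTITY", "B-INDICATOR", "B-ENTITY", "O"]  # hand-written, not model output
for label, start, end in bio_to_spans(tags):
    print(label, tokens[start:end])
```

The relation task would then operate on the spans recovered this way, assigning CAUSE, EFFECT, or INTERDEPENDENCY to them.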

Usage

Inference depends on the custom library that provides the causalbert package. Once it is installed, run inference like so:

from causalbert.infer import load_model, analyze_sentence_with_confidence

# Load the fine-tuned model together with its tokenizer, config, and device.
model, tokenizer, config, device = load_model("norygano/C-BERT")

# Analyze a single German sentence ("Car traffic causes bee deaths.").
result = analyze_sentence_with_confidence(
    model, tokenizer, config, "Autoverkehr verursacht Bienensterben.", []
)

Training

  • Base model: google-bert/bert-base-german-cased
  • Epochs: 3, LR: 2e-5, Batch size: 8
  • See train.py for details.
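The hyperparameters above can be collected in one place, as in the sketch below. It is not taken from train.py: the `TrainConfig` class, the `total_steps` helper, and the corpus size of 1000 examples are hypothetical, and the step count assumes no gradient accumulation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters listed in this README; the class itself is hypothetical."""

    base_model: str = "google-bert/bert-base-german-cased"
    epochs: int = 3
    learning_rate: float = 2e-5
    batch_size: int = 8

    def total_steps(self, num_examples: int) -> int:
        """Optimizer steps for the whole run, assuming no gradient accumulation."""
        return self.epochs * math.ceil(num_examples / self.batch_size)


config = TrainConfig()
# 1000 is a made-up corpus size, used only to show the arithmetic.
print(config.total_steps(num_examples=1000))
```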

Limitations

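  • German only; other languages are not supported.
  • Operates at the sentence level; causal relations that span multiple sentences are not handled.
  • Relation classification depends on the detected spans, so errors in token tagging propagate to the predicted relations.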
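The error propagation in the last point can be made concrete with a toy example. The sketch assumes that relation candidates are formed by pairing each INDICATOR span with each ENTITY span; that pairing rule, the `candidate_pairs` helper, and the span tuples are illustrative assumptions, not the library's actual logic.

```python
from typing import List, Tuple

# (label, start_token, end_token_exclusive); this layout is an assumption.
Span = Tuple[str, int, int]


def candidate_pairs(spans: List[Span]) -> List[Tuple[Span, Span]]:
    """Pair every INDICATOR span with every ENTITY span (hypothetical rule)."""
    indicators = [s for s in spans if s[0] == "INDICATOR"]
    entities = [s for s in spans if s[0] == "ENTITY"]
    return [(ind, ent) for ind in indicators for ent in entities]


# Spans for "Autoverkehr verursacht Bienensterben." with correct tagging.
correct = [("ENTITY", 0, 1), ("INDICATOR", 1, 2), ("ENTITY", 2, 3)]
# Same sentence, but the tagger missed the entity "Bienensterben".
missed = [("ENTITY", 0, 1), ("INDICATOR", 1, 2)]

print(len(candidate_pairs(correct)), "candidate pairs with correct tags")
print(len(candidate_pairs(missed)), "candidate pairs after a tagging miss")
```

Under this assumption, a span the tagger misses never reaches the relation classifier, so the relation involving it cannot be recovered downstream.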