BioBERT Disease NER

A biomedical named-entity recognition (NER) model, fine-tuned from BioBERT on the NCBI Disease dataset, that extracts disease mentions from biomedical text.
🔗 Live Demo (Disease-Extraction-System)
https://disease-extraction-system.vercel.app/
📂 GitHub
https://github.com/IshanSalunkhe6/disease-extraction-system
📊 Performance
Metric | Score |
---|---|
Precision | 86.80% |
Recall | 91.39% |
F1-score | 89.04% |
Accuracy | 98.64% |
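As a quick sanity check on the table above, F1 is the harmonic mean of precision and recall, so the reported F1-score can be recomputed from the other two rows. The helper name `f1_from_precision_recall` is illustrative only and is not part of this model's code.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (same scale in, same scale out)."""
    return 2 * precision * recall / (precision + recall)

# Values taken from the Performance table, in percent.
f1 = f1_from_precision_recall(86.80, 91.39)
print(f"F1 = {f1:.2f}%")  # prints "F1 = 89.04%"
```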
📚 Training Data
- Dataset: NCBI Disease
- Size: 6,800+ annotated mentions from 793 PubMed abstracts
🛠️ How to Use
from transformers import pipeline

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
nlp = pipeline(
    "ner",
    model="Ishan0612/biobert-ner-disease-ncbi",
    tokenizer="Ishan0612/biobert-ner-disease-ncbi",
    aggregation_strategy="simple",  # merge adjacent tokens that share a label
)

text = "The patient has signs of diabetes mellitus and chronic obstructive pulmonary disease."
results = nlp(text)

print("Extracted Medical Entities:")
for entity in results:
    print(f"{entity['word']} - ({entity['entity_group']})")
This should print output along these lines:
Extracted Medical Entities:
the patient has signs of - (LABEL_0)
diabetes - (LABEL_1)
mellitus - (LABEL_2)
and - (LABEL_0)
chronic - (LABEL_1)
obstructive pulmonary disease - (LABEL_2)
. - (LABEL_0)
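The sample output splits each disease name across two spans, which suggests a BIO-style tagging scheme. The sketch below merges those spans back into whole mentions. It assumes that `LABEL_0` means "outside", `LABEL_1` means "beginning of a disease mention", and `LABEL_2` means "inside a disease mention". That mapping is inferred from the sample output only and is not confirmed by the model card. The function name `merge_disease_mentions` is hypothetical.

```python
# ASSUMPTION: label meanings inferred from the sample output, not documented.
ASSUMED_BEGIN = "LABEL_1"   # first span of a disease mention
ASSUMED_INSIDE = "LABEL_2"  # continuation of a disease mention

def merge_disease_mentions(entities):
    """Join begin/inside spans from the pipeline output into full mentions."""
    mentions = []
    current = []  # words of the mention being built
    for ent in entities:
        label = ent["entity_group"]
        if label == ASSUMED_BEGIN:
            # A new mention starts; flush any mention still open.
            if current:
                mentions.append(" ".join(current))
            current = [ent["word"]]
        elif label == ASSUMED_INSIDE and current:
            current.append(ent["word"])
        else:
            # Outside span (or a stray inside span with no begin): close the mention.
            if current:
                mentions.append(" ".join(current))
            current = []
    if current:
        mentions.append(" ".join(current))
    return mentions

# Spans copied from the sample output above.
sample = [
    {"word": "the patient has signs of", "entity_group": "LABEL_0"},
    {"word": "diabetes", "entity_group": "LABEL_1"},
    {"word": "mellitus", "entity_group": "LABEL_2"},
    {"word": "and", "entity_group": "LABEL_0"},
    {"word": "chronic", "entity_group": "LABEL_1"},
    {"word": "obstructive pulmonary disease", "entity_group": "LABEL_2"},
    {"word": ".", "entity_group": "LABEL_0"},
]
print(merge_disease_mentions(sample))
# ['diabetes mellitus', 'chronic obstructive pulmonary disease']
```

The same function can be called on the `results` list returned by the pipeline, since it only reads the `word` and `entity_group` keys.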
License
This model is licensed under the Apache 2.0 License, the same license as the original BioBERT (dmis-lab/biobert-base-cased-v1.1).
Citation
@article{lee2020biobert,
  title={BioBERT: a pre-trained biomedical language representation model for biomedical text mining},
  author={Lee, Jinhyuk and Yoon, Wonjin and Kim, Sungdong and Kim, Donghyeon and So, Chan Ho and Kang, Jaewoo},
  journal={Bioinformatics},
  volume={36},
  number={4},
  pages={1234--1240},
  year={2020},
  publisher={Oxford University Press}
}