BioBERT Disease NER

A biomedical named entity recognition (NER) model, fine-tuned from BioBERT on the NCBI Disease dataset, that extracts disease mentions from biomedical text.

🔗 Live Demo (Disease-Extraction-System)

https://disease-extraction-system.vercel.app/

📂 GitHub

https://github.com/IshanSalunkhe6/disease-extraction-system

📊 Performance

Metric      Score
Precision   86.80%
Recall      91.39%
F1-score    89.04%
Accuracy    98.64%
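
As a quick sanity check on these numbers, the F1-score is the harmonic mean of precision and recall. The short sketch below recomputes it from the reported precision and recall; the variable names are illustrative only, and the values are simply copied from the scores above.

```python
# Reported scores from the table above, expressed as fractions.
precision = 0.8680
recall = 0.9139

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"F1 = {f1:.2%}")  # F1 = 89.04%
```

The recomputed value agrees with the reported F1-score to two decimal places.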

📚 Training Data

  • Dataset: NCBI Disease
  • Size: 6,800+ annotated mentions from 793 PubMed abstracts

🛠️ How to Use

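from transformers import pipeline

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
# aggregation_strategy="simple" groups adjacent tokens that share a label into one span.
nlp = pipeline(
    "ner",
    model="Ishan0612/biobert-ner-disease-ncbi",
    tokenizer="Ishan0612/biobert-ner-disease-ncbi",
    aggregation_strategy="simple",
)

text = "The patient has signs of diabetes mellitus and chronic obstructive pulmonary disease."
results = nlp(text)  # one dict per span, with keys such as "word" and "entity_group"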

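print("Extracted Medical Entities:")
for entity in results:
    print(f"{entity['word']} - ({entity['entity_group']})")

This should output:

Extracted Medical Entities:
the patient has signs of - (LABEL_0)
diabetes - (LABEL_1)
mellitus - (LABEL_2)
and - (LABEL_0)
chronic - (LABEL_1)
obstructive pulmonary disease - (LABEL_2)
. - (LABEL_0)

The labels are the model's generic class IDs. Judging by this output, LABEL_0 marks non-disease text, LABEL_1 marks the start of a disease mention, and LABEL_2 marks its continuation, so a LABEL_1 span followed by a LABEL_2 span forms one mention (here, "diabetes mellitus" and "chronic obstructive pulmonary disease").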
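
Because the pipeline returns the start and the continuation of a mention as separate spans, a small post-processing step can stitch them back together. The sketch below is one possible way to do that, not part of the model or the transformers library: `merge_disease_mentions` is a hypothetical helper, and it assumes that LABEL_1 marks the beginning of a disease mention and LABEL_2 its continuation, as the output above suggests. The `sample` list mirrors that output so the example runs without downloading the model.

```python
def merge_disease_mentions(entities, begin="LABEL_1", inside="LABEL_2"):
    """Join begin/inside spans into full disease mentions.

    Assumes `begin` starts a mention and `inside` continues it
    (an assumption inferred from the example output, not a documented mapping).
    """
    mentions = []
    current = []  # words of the mention currently being built

    def flush():
        # Store the mention collected so far, if any, and reset the buffer.
        if current:
            mentions.append(" ".join(current))
            current.clear()

    for ent in entities:
        label = ent["entity_group"]
        if label == begin or (label == inside and not current):
            # A begin label (or an orphaned inside label) starts a new mention.
            flush()
            current.append(ent["word"])
        elif label == inside:
            current.append(ent["word"])
        else:
            # Any other label ends the current mention.
            flush()
    flush()
    return mentions


# Spans copied from the example output above (only the keys used here).
sample = [
    {"word": "the patient has signs of", "entity_group": "LABEL_0"},
    {"word": "diabetes", "entity_group": "LABEL_1"},
    {"word": "mellitus", "entity_group": "LABEL_2"},
    {"word": "and", "entity_group": "LABEL_0"},
    {"word": "chronic", "entity_group": "LABEL_1"},
    {"word": "obstructive pulmonary disease", "entity_group": "LABEL_2"},
    {"word": ".", "entity_group": "LABEL_0"},
]

print(merge_disease_mentions(sample))
# ['diabetes mellitus', 'chronic obstructive pulmonary disease']
```

With the real pipeline, `merge_disease_mentions(results)` would be called on the list returned by `nlp(text)`. Joining words with a single space is a simplification; slicing the original text with each span's `start` and `end` offsets would preserve the exact surface form.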

License

This model is licensed under the Apache 2.0 License, the same license as the original BioBERT model (dmis-lab/biobert-base-cased-v1.1).

Citation

@article{lee2020biobert,
  title={BioBERT: a pre-trained biomedical language representation model for biomedical text mining},
  author={Lee, Jinhyuk and Yoon, Wonjin and Kim, Sungdong and Kim, Donghyeon and So, Chan Ho and Kang, Jaewoo},
  journal={Bioinformatics},
  volume={36},
  number={4},
  pages={1234--1240},
  year={2020},
  publisher={Oxford University Press}
}
