ppddddpp committed on
Commit 71b8e11 · verified
1 Parent(s): 9775dbd

Create README.md

  1. README.md +76 -0
README.md ADDED
@@ -0,0 +1,76 @@
+ ---
+ license: mit
+ tags:
+ - chest-xray
+ - medical
+ - multimodal
+ - retrieval
+ - explanation
+ - clinicalbert
+ - swin-transformer
+ - deep-learning
+ - image-text
+ datasets:
+ - openi
+ language:
+ - en
+ ---
+
+ # Multimodal Chest X-ray Retrieval & Diagnosis (ClinicalBERT + Swin)
+
+ This model jointly encodes chest X-rays (DICOM) and radiology reports (XML) to:
+
+ - Predict medical conditions from multimodal input (image + text)
+ - Retrieve similar cases using shared disease-aware embeddings
+ - Provide visual explanations using attention and Integrated Gradients (IG)
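
Integrated Gradients, named in the last bullet, attributes a prediction to input features by integrating gradients along a straight path from a baseline to the input. The sketch below is not this project's code (the README credits Captum for IG). It assumes a hypothetical toy scorer `model(x, w) = sigmoid(w·x)` with an analytic gradient, and approximates the path integral with a midpoint Riemann sum. All names and values are illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def model(x, w):
    # Hypothetical toy scorer: sigmoid of a dot product.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def gradient(x, w):
    # Analytic gradient of sigmoid(w·x) with respect to x.
    p = model(x, w)
    return [p * (1.0 - p) * wi for wi in w]

def integrated_gradients(x, baseline, w, steps=200):
    # Average the gradient at `steps` midpoints on the baseline -> x path,
    # then scale each feature by its distance from the baseline.
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(gradient(point, w)):
            avg_grad[i] += g / steps
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]

w = [2.0, -1.0, 0.5]          # illustrative weights
x = [1.0, 0.5, -2.0]          # illustrative input
baseline = [0.0, 0.0, 0.0]    # all-zero baseline (an assumption)
attr = integrated_gradients(x, baseline, w)

# Completeness check: attributions should sum to f(x) - f(baseline).
completeness_gap = abs(sum(attr) - (model(x, w) - model(baseline, w)))
```

A small `completeness_gap` is a quick sanity check that the number of integration steps is adequate.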
+
+ > Developed as a final project at HCMUS.
+
+ ---
+
+ ## Model Architecture
+
+ - **Image Encoder:** Swin Transformer (pretrained, fine-tuned)
+ - **Text Encoder:** ClinicalBERT
+ - **Fusion Module:** Cross-modal attention with optional hybrid FFN layers
+ - **Losses:** BCE + Focal Loss for multi-label classification
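
To make the loss bullet concrete, here is a minimal sketch of a per-label BCE term combined with a focal term. The README does not say how the two are weighted or which hyperparameters are used, so `focal_weight`, `gamma=2.0`, and `alpha=0.25` are assumptions chosen for illustration, and the functions operate on plain probabilities rather than model logits.

```python
import math

def bce(p, y, eps=1e-7):
    # Binary cross-entropy for one label; p is the predicted probability.
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def focal(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    # Focal loss: down-weights labels the model already predicts well.
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def combined_loss(probs, labels, focal_weight=1.0):
    # Mean over labels of BCE plus a weighted focal term.
    # The additive combination and focal_weight are assumptions.
    per_label = [bce(p, y) + focal_weight * focal(p, y)
                 for p, y in zip(probs, labels)]
    return sum(per_label) / len(per_label)

good = combined_loss([0.9, 0.1], [1, 0])   # confident and correct
bad = combined_loss([0.1, 0.9], [1, 0])    # confident and wrong
```

With `gamma=0` and `alpha=0.5` the focal term reduces to half the BCE, which is a convenient check when tuning.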
+
+ Embeddings from both modalities are projected into a **shared joint space**, enabling retrieval and explanation.
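
Retrieval in a shared embedding space is commonly done by ranking stored cases by cosine similarity to a query embedding. The README does not state which similarity measure or index the project uses, so the following is only a sketch under that assumption; the case IDs and three-dimensional vectors are made up for illustration.

```python
import math

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def cosine_similarity(a, b):
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

def retrieve(query, index, k=2):
    # Score every stored case against the query and keep the top k.
    scored = [(case_id, cosine_similarity(query, emb))
              for case_id, emb in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]

# Hypothetical joint-space embeddings for three stored cases.
index = {
    "case_a": [0.9, 0.1, 0.0],
    "case_b": [0.0, 1.0, 0.2],
    "case_c": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]
top = retrieve(query, index, k=2)
```

A brute-force scan like this is fine for small collections; larger ones would usually use an approximate nearest-neighbour index instead.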
+
+ ---
+
+ ## Training Data
+
+ - **Dataset:** [NIH Open-i Chest X-ray Dataset](https://openi.nlm.nih.gov/)
+ - **Input Modalities:**
+   - Chest X-ray DICOMs
+   - Associated XML radiology reports
+ - **Labels:** MeSH-derived disease categories (multi-label)
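
For multi-label training, each case's disease categories are typically encoded as a multi-hot target vector. The sketch below shows that encoding only; the category list is hypothetical (the README does not enumerate the MeSH-derived categories), and how labels are extracted from the XML reports is not covered here.

```python
# Hypothetical category vocabulary; the real label set is not given in the README.
CATEGORIES = ["cardiomegaly", "effusion", "pneumonia", "normal"]

def multi_hot(case_labels, categories=CATEGORIES):
    # Return a 0/1 vector with one slot per category.
    present = {label.strip().lower() for label in case_labels}
    return [1 if c in present else 0 for c in categories]

target = multi_hot(["Cardiomegaly", "Effusion"])
empty = multi_hot([])
```

Each position in the vector lines up with one output of the classifier, so several conditions can be active for the same study.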
+
+ ---
+
+ ## Intended Uses
+ * Clinical Education: Case similarity search for radiology students
+
+ * Research: Baseline for multimodal medical retrieval
+
+ * Explainability: Visualize disease evidence in both image and text
+
+ ## Limitations & Risks
+ * Trained on a public dataset (Open-i), so it may not generalize to other hospitals
+
+ * Explanations are not clinically validated
+
+ * Not for diagnostic use in real-world settings
+
+ ## Acknowledgments
+ * NIH Open-i Dataset
+
+ * Swin Transformer (Timm)
+
+ * ClinicalBERT (Emily Alsentzer)
+
+ * Captum (for IG explanations)
+
+ ## Links
+ GitHub: