---
license: apache-2.0
datasets:
- ibrahimhamamci/CT-RATE
metrics:
- bleu
- bertscore
- rouge
base_model:
- microsoft/Phi-3-mini-4k-instruct
tags:
- biology
- medical
---
# Welcome to the SAMF model [MICCAI '25]!

**[MICCAI '25]** From Slices to Volumes: Multi-Scale Fusion of 2D and 3D Features for CT Scan Report Generation
| Model | BLEU-1 | BLEU-4 | ROUGE-L | METEOR | BERT F1 | Llama Score |
|---|---|---|---|---|---|---|
| CT2Rep | 0.309 | 0.172 | 0.243 | 0.173 | 0.865 | 6.35 |
| CT-Chat | 0.395 | - | 0.321 | 0.219 | - | 5.664 |
| Our baseline (SAMF) | 0.423 | 0.203 | 0.338 | 0.356 | 0.879 | 6.792 |
| SAMF + Ao2D | 0.440 | 0.261 | 0.417 | 0.417 | 0.889 | 7.165 |
## Introduction

Slice Attentive Multimodal Fusion (SAMF) is a framework that combines the rich, high-resolution information of 2D slices with the spatial coherence of 3D volumetric data. Experimental results show that our method outperforms existing baselines on both report generation and multiple-choice question answering, highlighting the critical role of multi-dimensional feature integration.
## Model Description
- Model type: 3D Medical Report Generation and Visual Question Answering
- Language(s) (NLP): English
- License: apache-2.0
- Finetuned from model: microsoft/Phi-3-mini-4k-instruct
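Since the language backbone is fine-tuned from Phi-3-mini-4k-instruct, prompts should follow that model's chat format. The sketch below is an illustration only (SAMF's exact prompt construction lives in the GitHub repository; in practice, prefer `tokenizer.apply_chat_template`):

```python
def build_prompt(question: str) -> str:
    """Format a question using the standard Phi-3-mini-4k-instruct
    chat template (<|user|> ... <|end|> <|assistant|>).

    Note: this mirrors the base model's template and is a sketch;
    it is not guaranteed to match SAMF's training-time prompting.
    """
    return f"<|user|>\n{question}<|end|>\n<|assistant|>\n"


# Example: a report-generation style query for a CT study
prompt = build_prompt("Generate a radiology report for this chest CT scan.")
```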
## Training Data
- Medical Report Generation and Visual Question Answering: ibrahimhamamci/CT-RATE, default subset
## Hardware Utilization

- Hardware type: 1 × NVIDIA A100
- Hours used: ~16 hours
## Evaluation

To evaluate this model, please refer to our GitHub repository, [serag-ai/SAMF](https://github.com/serag-ai/SAMF), which provides detailed usage instructions.
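For intuition about the BLEU-1 column in the results table: sentence-level BLEU-1 is clipped unigram precision with a brevity penalty. The following is a minimal sketch (whitespace tokenization assumed; it is not the exact evaluation pipeline used for the reported numbers):

```python
import math
from collections import Counter


def bleu1(candidate: str, reference: str) -> float:
    """Sentence-level BLEU-1: clipped unigram precision x brevity penalty."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0
    cand_counts, ref_counts = Counter(cand), Counter(ref)
    # Each candidate token is credited at most as often as it
    # appears in the reference (clipping).
    clipped = sum(min(n, ref_counts[tok]) for tok, n in cand_counts.items())
    precision = clipped / len(cand)
    # Brevity penalty discourages overly short generated reports.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision
```

In corpus-level evaluation the statistics are aggregated over all reports before computing precision and the brevity penalty, so scores from this per-sentence sketch will not exactly match library implementations such as sacrebleu.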