# SAMF [MICCAI 2025]

**From Slices to Volumes: Multi-Scale Fusion of 2D and 3D Features for CT Scan Report Generation**
| Model | BLEU-1 | BLEU-4 | ROUGE-L | METEOR | BERT F1 | Llama Score |
|---|---|---|---|---|---|---|
| CT2Rep | 0.309 | 0.172 | 0.243 | 0.173 | 0.865 | 6.35 |
| CT-Chat | 0.395 | - | 0.321 | 0.219 | - | 5.664 |
| SAMF (ours, baseline) | 0.423 | 0.203 | 0.338 | 0.356 | 0.879 | 6.792 |
| SAMF + Ao2D | 0.440 | 0.261 | 0.417 | 0.417 | 0.889 | 7.165 |
## Introduction
We introduce Slice Attentive Multimodal Fusion (SAMF), a framework that combines the rich, high-resolution information from 2D slices with the spatial coherence of 3D volumetric data. Experimental results demonstrate that our method outperforms existing baseline approaches in both report generation and multiple-choice question answering, highlighting the critical role of multidimensional feature integration.
## Model Description
- Model type: 3D Medical Report Generation and Visual Question Answering
- Language(s) (NLP): English
- License: apache-2.0
- Finetuned from model: microsoft/Phi-3-mini-4k-instruct
## Training Data
- Medical Report Generation and Visual Question Answering: ibrahimhamamci/CT-RATE, default subset
## Hardware Utilization
- Hardware type: 1 × NVIDIA A100
- Hours used: ~16
## Evaluation
To evaluate this model, please refer to our GitHub repository (serag-ai/SAMF), which provides detailed usage instructions.
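For intuition about the scores reported above, BLEU-1 is clipped unigram precision between a generated report and the reference, scaled by a brevity penalty. The following is a minimal illustrative sketch of that metric, not the evaluation code from our repository:

```python
from collections import Counter
import math

def bleu1(candidate: str, reference: str) -> float:
    """BLEU-1: clipped unigram precision with a brevity penalty."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0
    cand_counts = Counter(cand)
    ref_counts = Counter(ref)
    # Clip each candidate unigram count by its count in the reference.
    overlap = sum(min(c, ref_counts[w]) for w, c in cand_counts.items())
    precision = overlap / len(cand)
    # Penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision
```

An identical candidate and reference score 1.0, while a truncated candidate is penalized even when every word it contains is correct. Our full evaluation additionally reports BLEU-4, ROUGE-L, METEOR, BERT F1, and an LLM-judged Llama score; see the repository for the exact pipeline.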