---
license: apache-2.0
datasets:
- psp-dada/SENTINEL
language:
- en
base_model:
- Qwen/Qwen2-VL-2B-Instruct
pipeline_tag: image-text-to-text
library_name: transformers
tags:
- lora
---

# Model Card for ``psp-dada/Qwen2-VL-2B-Instruct-SENTINEL`` | ICCV 2025 | SENTINEL: Mitigating Object Hallucinations via Sentence-Level Early Intervention

## 🎊 News

- [2025.07.21] All code, data, and models are released!
- [2025.06.26] 🎉 Our SENTINEL is accepted by **ICCV 2025**!

## 🚀 Overview

**SENTINEL** introduces an automatic, sentence‑level early intervention strategy to prevent and mitigate object hallucinations in multimodal large language models (MLLMs). Key advantages:

- **Annotation‑free**: No human labeling required.
- **Model-agnostic**: Compatible with any MLLM architecture.
- **Efficient**: Lightweight LoRA fine‑tuning.

## 🔑 Key Features

- 🧠 **Early intervention halts hallucination propagation**. We find that hallucinations of MLLMs predominantly arise in early sentences and propagate through the rest of the output. SENTINEL interrupts this chain early to maximize mitigation.

- 🔍 **In-domain contextual preference learning without human labels**. SENTINEL constructs hallucinated/factual samples via detector cross-validation and builds context-aware preference data without relying on proprietary LLMs or manual annotations.

- 💡 **Context matters: rich coherence drives robustness**. By prioritizing context-coherent positive samples over hallucinated ones, SENTINEL significantly boosts generalization.

- ♻️ **Iterative contextual bootstrapping for diverse hallucination-free contexts**. Our pipeline dynamically grows non-hallucinated contexts and expands coverage across varied scenes, improving robustness across generations (see the sketch after this list).

- 📊 **State-of-the-art results across benchmarks**. SENTINEL achieves **up to 92% reduction** in hallucinations and outperforms prior SOTA methods across Object HalBench, AMBER, and HallusionBench, while maintaining or improving general task performance.
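
To make the data-construction idea above concrete, here is a minimal, illustrative sketch of sentence-level detector cross-validation and contextual bootstrapping. Every name in it (`generate_sentence`, `extract_objects`, `detectors`, and the pair-building logic) is a hypothetical placeholder inferred from the feature list above, not the actual SENTINEL implementation; see the GitHub repo for the real pipeline.

```python
# Illustrative sketch only: SENTINEL-style preference-data construction,
# based on the feature list above. All callables are hypothetical
# placeholders; the image is assumed to be closed over by
# `generate_sentence` and by each detector.

def build_preference_pairs(generate_sentence, extract_objects, detectors,
                           prompt, num_candidates=4, max_sentences=8):
    """Iteratively grow a hallucination-free context and collect
    context-aware (chosen, rejected) sentence pairs."""
    context, pairs = prompt, []
    for _ in range(max_sentences):
        # Sample several candidate continuations of the current context.
        candidates = [generate_sentence(context) for _ in range(num_candidates)]
        factual, hallucinated = [], []
        for sentence in candidates:
            # Cross-validation: a sentence counts as factual only if every
            # detector confirms every object it mentions.
            objects = extract_objects(sentence)
            if all(detector(obj) for detector in detectors for obj in objects):
                factual.append(sentence)
            else:
                hallucinated.append(sentence)
        if factual and hallucinated:
            # In-domain preference pair sharing the same clean context:
            # factual sentence as positive, hallucinated one as negative.
            pairs.append({"context": context,
                          "chosen": factual[0],
                          "rejected": hallucinated[0]})
        if not factual:
            break  # no hallucination-free continuation; stop growing this context
        # Iterative bootstrapping: extend the context only with a factual
        # sentence, so later pairs condition on context-coherent text.
        context = context + " " + factual[0]
    return pairs
```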

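## How to use

This model is a PEFT (LoRA) adapter. You first need to load the base model (`Qwen/Qwen2-VL-2B-Instruct`) and then load this adapter on top of it.

**For the details of this model, please refer to the [documentation](https://github.com/pspdada/SENTINEL?tab=readme-ov-file#-model-weights) of the GitHub repo.**

Below is a minimal loading sketch, assuming a recent `transformers` release with Qwen2-VL support plus the `peft` and `Pillow` packages; it follows the standard Qwen2-VL usage pattern and is untested here, so adapt it to your environment. `"your_image.jpg"` is a placeholder.

```python
from PIL import Image
from peft import PeftModel
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

base_id = "Qwen/Qwen2-VL-2B-Instruct"
adapter_id = "psp-dada/Qwen2-VL-2B-Instruct-SENTINEL"

# Load the base model first, then attach the SENTINEL LoRA adapter on top.
model = Qwen2VLForConditionalGeneration.from_pretrained(
    base_id, torch_dtype="auto", device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)
processor = AutoProcessor.from_pretrained(base_id)

# "your_image.jpg" is a placeholder; use any local image.
image = Image.open("your_image.jpg")
conversation = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image in detail."},
    ]}
]
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=256)
answer = processor.batch_decode(
    output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(answer)
```

To merge the adapter into the base weights for faster inference, `model.merge_and_unload()` from PEFT can be used after loading.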
## 📝 Citation

If you find our model/code/data/paper helpful, please consider citing our paper 📝 and starring us ⭐️!

```bibtex
@article{peng2025mitigating,
  title={Mitigating Object Hallucinations via Sentence-Level Early Intervention},
  author={Peng, Shangpin and Yang, Senqiao and Jiang, Li and Tian, Zhuotao},
  journal={arXiv preprint arXiv:2507.12455},
  year={2025}
}
```

## 📧 Contact us

If you have any questions, comments, or suggestions, please do not hesitate to submit an issue or PR to help advance research in this area.