---
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
language:
- en
library_name: peft
license: llama3.1
pipeline_tag: text2text-generation
---

# LLaMA-3.1-8B-LoRA-COCO-Deceptive-CLIP Model Card

> 🏆 **This work is accepted to ACL 2025 (Main Conference).**

*Figure (main result): Attack success rate (ASR) and caption diversity of our model on the COCO dataset, illustrating its ability to generate deceptive captions that successfully fool CLIP.*

## Model Description

- **Repository:** [Code](https://github.com/ahnjaewoo/MAC)
- **Paper:** [Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates](https://arxiv.org/abs/2505.22943)
- **Point of Contact:** [Jaewoo Ahn](mailto:jaewoo.ahn@vision.snu.ac.kr), [Heeseung Yun](mailto:heeseung.yun@vision.snu.ac.kr)

## Model Details

- **Model**: *LLaMA-3.1-8B-LoRA-COCO-Deceptive-CLIP* is a deceptive caption generator built on **LLaMA-3.1-8B**, fine-tuned with LoRA via *self-training* (more specifically, *rejection sampling fine-tuning*, RFT) to deceive **CLIP** on the **COCO** dataset. It achieves an **attack success rate (ASR)** of **42.1%**.
- **Architecture**: This model is based on [LLaMA-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) and uses [PEFT](https://github.com/huggingface/peft) v0.12.0 for efficient fine-tuning.

## How to Use

See our GitHub [repository](https://github.com/ahnjaewoo/MAC) for full usage instructions and scripts.
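At a high level, rejection sampling fine-tuning keeps only the model generations that succeed at deceiving CLIP and fine-tunes on them. The selection step can be sketched as below; this is a minimal illustration, not the paper's implementation: `clip_score` is a stub scorer, all names are hypothetical, and the actual success criterion is defined in the paper and repository.

```python
# Illustrative sketch only: `clip_score` is a stub; the real pipeline scores
# (image, caption) pairs with CLIP and applies the paper's success criterion.

def select_deceptive_captions(image, original_caption, candidates, clip_score):
    """Keep candidate captions that CLIP scores higher than the original
    caption for the same image, i.e., captions that fool CLIP."""
    baseline = clip_score(image, original_caption)
    return [c for c in candidates if clip_score(image, c) > baseline]

# Toy stand-in scorer backed by a lookup table (values are made up).
toy_scores = {"a dog on grass": 0.31, "a cat on grass": 0.34, "a dog indoors": 0.22}
stub_clip = lambda image, caption: toy_scores[caption]

kept = select_deceptive_captions(
    image=None,
    original_caption="a dog on grass",
    candidates=["a cat on grass", "a dog indoors"],
    clip_score=stub_clip,
)
# kept == ["a cat on grass"]
```

In RFT, the surviving pairs would then form the fine-tuning set for the next LoRA training round; see the repository scripts for the authoritative loop.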