Vision-Language Models Struggle to Align Entities across Modalities
Paper • 2503.03854