Beyond Memorization: A Multi-Modal Ordinal Regression Benchmark to Expose Popularity Bias in Vision-Language Models
Paper • 2512.21337 • Published • 23
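Note: Comparing how accurately a VLM dates popular versus less popular buildings can show whether it generalizes or leans mostly on memorization. (Is this a typical task?)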