Submitted by Yilun Zhao
Can Multimodal Foundation Models Understand Schematic Diagrams? An Empirical Study on Information-Seeking QA over Scientific Papers
Yale NLP Lab