DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning Paper • 2508.05405 • Published 14 days ago • 61