Article You could have designed state of the art positional encoding By FL33TW00D-HF • Nov 25, 2024 • 355
RealQA Collection Next Token Is Enough: Realistic Image Quality and Aesthetic Scoring with Multimodal Large Language Model • 3 items • Updated Jun 3
UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning Paper • 2505.14231 • Published May 20 • 53 • 5
Next Token Is Enough: Realistic Image Quality and Aesthetic Scoring with Multimodal Large Language Model Paper • 2503.06141 • Published Mar 8 • 4 • 2