Multimodal RewardBench: Holistic Evaluation of Reward Models for Vision Language Models Paper • 2502.14191 • Published 4 days ago • 4
Enhancing Cognition and Explainability of Multimodal Foundation Models with Self-Synthesized Data Paper • 2502.14044 • Published 4 days ago • 6
From RAG to Memory: Non-Parametric Continual Learning for Large Language Models Paper • 2502.14802 • Published 3 days ago • 9
S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning Paper • 2502.12853 • Published 5 days ago • 22
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning Paper • 2502.14768 • Published 3 days ago • 32
How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM? Paper • 2502.14502 • Published 3 days ago • 65
MLGym: A New Framework and Benchmark for Advancing AI Research Agents Paper • 2502.14499 • Published 3 days ago • 148
InfiR : Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning Paper • 2502.11573 • Published 6 days ago • 7
Is That Your Final Answer? Test-Time Scaling Improves Selective Question Answering Paper • 2502.13962 • Published 4 days ago • 27
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published 3 days ago • 99
The Hidden Risks of Large Reasoning Models: A Safety Assessment of R1 Paper • 2502.12659 • Published 5 days ago • 5
Flow-of-Options: Diversified and Improved LLM Reasoning by Thinking Through Options Paper • 2502.12929 • Published 5 days ago • 6
Revisiting the Test-Time Scaling of o1-like Models: Do they Truly Possess Test-Time Scaling Capabilities? Paper • 2502.12215 • Published 7 days ago • 15