-
Evolving Deeper LLM Thinking
Paper • 2501.09891 • Published • 106 -
PaSa: An LLM Agent for Comprehensive Academic Paper Search
Paper • 2501.10120 • Published • 43 -
Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong
Paper • 2501.09775 • Published • 29 -
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario
Paper • 2501.10132 • Published • 19
Collections
Discover the best community collections!
Collections including paper arxiv:2502.01142
-
RPO: Retrieval Preference Optimization for Robust Retrieval-Augmented Generation
Paper • 2501.13726 • Published -
RAG-Star: Enhancing Deliberative Reasoning with Retrieval Augmented Verification and Refinement
Paper • 2412.12881 • Published • 2 -
DeepRAG: Thinking to Retrieval Step by Step for Large Language Models
Paper • 2502.01142 • Published • 23
-
Rethinking Mixture-of-Agents: Is Mixing Different Large Language Models Beneficial?
Paper • 2502.00674 • Published • 12 -
Demystifying Long Chain-of-Thought Reasoning in LLMs
Paper • 2502.03373 • Published • 51 -
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
Paper • 2502.02737 • Published • 190 -
DeepRAG: Thinking to Retrieval Step by Step for Large Language Models
Paper • 2502.01142 • Published • 23
-
MM-IQ: Benchmarking Human-Like Abstraction and Reasoning in Multimodal Models
Paper • 2502.00698 • Published • 23 -
DeepRAG: Thinking to Retrieval Step by Step for Large Language Models
Paper • 2502.01142 • Published • 23 -
ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning
Paper • 2502.01100 • Published • 15 -
The Jumping Reasoning Curve? Tracking the Evolution of Reasoning Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles
Paper • 2502.01081 • Published • 14
-
Chain-of-Retrieval Augmented Generation
Paper • 2501.14342 • Published • 51 -
DeepRAG: Thinking to Retrieval Step by Step for Large Language Models
Paper • 2502.01142 • Published • 23 -
SafeRAG: Benchmarking Security in Retrieval-Augmented Generation of Large Language Model
Paper • 2501.18636 • Published • 27
-
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
Paper • 2501.04519 • Published • 257 -
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics
Paper • 2501.04686 • Published • 50 -
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Though
Paper • 2501.04682 • Published • 90 -
Agent Laboratory: Using LLM Agents as Research Assistants
Paper • 2501.04227 • Published • 85
-
MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models
Paper • 2501.02955 • Published • 40 -
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Paper • 2501.00958 • Published • 99 -
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
Paper • 2501.12380 • Published • 82 -
VideoWorld: Exploring Knowledge Learning from Unlabeled Videos
Paper • 2501.09781 • Published • 25