Atom-Searcher: Enhancing Agentic Deep Research via Fine-Grained Atomic Thought Reward
Abstract
Atom-Searcher, an RL framework integrating Atomic Thought and Reasoning Reward Models, enhances LLMs' multi-hop reasoning and strategic search capabilities, improving performance and interpretability.
Large language models (LLMs) exhibit remarkable problem-solving abilities, but struggle with complex tasks due to static internal knowledge. Retrieval-Augmented Generation (RAG) enhances access to external information, yet remains limited in multi-hop reasoning and strategic search due to rigid workflows. Recent advancements in agentic deep research empower LLMs to autonomously reason, search, and synthesize information. However, current approaches relying on outcome-based reinforcement learning (RL) face critical issues such as conflicting gradients and reward sparsity, limiting performance gains and training efficiency. To address these, we first propose Atomic Thought, a novel LLM thinking paradigm that decomposes reasoning into fine-grained functional units. These units are supervised by Reasoning Reward Models (RRMs), which provide Atomic Thought Rewards (ATR) for fine-grained guidance. Building on this, we propose Atom-Searcher, a novel RL framework for agentic deep research that integrates Atomic Thought and ATR. Atom-Searcher uses a curriculum-inspired reward schedule, prioritizing process-level ATR early and transitioning to outcome rewards, accelerating convergence on effective reasoning paths. Experiments on seven benchmarks show consistent improvements over the state-of-the-art. Key advantages include: (1) Atom-Searcher scales computation at test-time. (2) Atomic Thought provides supervision anchors for RRMs, bridging deep research tasks and RRMs. (3) Atom-Searcher exhibits more interpretable, human-like reasoning patterns.
Community
Atom-Searcher is a novel framework designed to enhance the deep research capabilities of Large Language Models (LLMs). While LLMs show great promise, their static internal knowledge limits their ability to handle complex, multi-step tasks. Existing methods like Retrieval-Augmented Generation (RAG) and outcome-based reinforcement learning (RL) often fall short due to rigid workflows, reward sparsity, and conflicting gradients during training.
To overcome these challenges, we introduce Atom-Searcher, a new reinforcement learning framework built on the concept of Atomic Thought. This paradigm decomposes complex reasoning into fine-grained, functional units. Each "atomic thought" is evaluated by a Reasoning Reward Model (RRM), providing a fine-grained Atomic Thought Reward (ATR) that guides the agent's learning process.
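As a rough illustration of the idea, the sketch below splits a reasoning trace into atomic thoughts and averages a per-thought score into a single process-level reward. The tag format (`<NAME>...</NAME>`), the function names, and the averaging rule are all assumptions made for this example; the post does not specify how thoughts are delimited or how RRM scores are aggregated, and `score_fn` is only a stand-in for a real Reasoning Reward Model.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical format: each atomic thought is wrapped in <NAME>...</NAME>.
ATOM_PATTERN = re.compile(r"<([A-Z_]+)>(.*?)</\1>", re.DOTALL)


def split_atomic_thoughts(trace: str) -> List[Tuple[str, str]]:
    """Return (thought_type, text) pairs found in a reasoning trace."""
    return [(name, body.strip()) for name, body in ATOM_PATTERN.findall(trace)]


def atomic_thought_reward(
    trace: str, score_fn: Callable[[str, str], float]
) -> float:
    """Average the per-thought scores; 0.0 if no atomic thought is found.

    score_fn is a placeholder for an RRM call (an assumption, not a real API).
    """
    atoms = split_atomic_thoughts(trace)
    if not atoms:
        return 0.0
    return sum(score_fn(name, body) for name, body in atoms) / len(atoms)
```

With a toy scorer such as `lambda name, body: 1.0 if name == "PLAN" else 0.0`, a trace containing one `PLAN` and one `VERIFY` thought would receive a reward of 0.5 under this averaging rule.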
The framework uses a curriculum-inspired reward schedule that initially prioritizes high-quality reasoning processes before shifting focus to final outcomes, which accelerates the discovery of effective problem-solving strategies.
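One way to picture such a schedule is a process-reward weight that shrinks as training progresses. The sketch below assumes a linear decay and an additive combination with the outcome reward; the decay shape, the `initial_weight` value of 0.5, and the function names are illustrative assumptions, since the post only states that ATR is prioritized early and outcome rewards later.

```python
def atr_weight(step: int, total_steps: int, initial_weight: float = 0.5) -> float:
    """Weight on the Atomic Thought Reward, decaying linearly to zero.

    The linear shape and the default initial_weight are assumptions.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so out-of-range steps stay well defined.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return initial_weight * (1.0 - progress)


def combined_reward(
    atr: float,
    outcome: float,
    step: int,
    total_steps: int,
    initial_weight: float = 0.5,
) -> float:
    """Outcome reward plus the down-weighted process-level reward."""
    return atr_weight(step, total_steps, initial_weight) * atr + outcome
```

Under these assumptions the process-level term contributes most at step 0 and vanishes at the final step, so late training is driven by the outcome reward alone.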
Key advantages of Atom-Searcher include:
- State-of-the-Art Performance: Achieves consistent improvements over existing models on seven different benchmarks.
- Enhanced Interpretability: Exhibits more human-like and understandable reasoning patterns by breaking down its thought process.
- Efficient Training: Mitigates issues of reward sparsity and gradient conflicts, leading to more efficient policy optimization.
- Scalable Computation: Effectively scales its computational efforts during test-time to tackle more complex queries.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- From Sufficiency to Reflection: Reinforcement-Guided Thinking Quality in Retrieval-Augmented Reasoning for LLMs (2025)
- SSRL: Self-Search Reinforcement Learning (2025)
- RAG-R1 : Incentivize the Search and Reasoning Capabilities of LLMs through Multi-query Parallelism (2025)
- UR$^2$: Unify RAG and Reasoning through Reinforcement Learning (2025)
- Careful Queries, Credible Results: Teaching RAG Models Advanced Web Search Tools with Reinforcement Learning (2025)
- L0: Reinforcement Learning to Become General Agents (2025)
- GRAIL:Learning to Interact with Large Knowledge Graphs for Retrieval Augmented Reasoning (2025)