On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes Paper • 2306.13649 • Published Jun 23, 2023 • 25
Time Is a Feature: Exploiting Temporal Dynamics in Diffusion Language Models Paper • 2508.09138 • Published 25 days ago • 36
GraphMind Collection More in https://arxiv.org/pdf/2507.17168, Graph Reasoning Model series • 4 items • Updated 16 days ago
Improving LLMs' Generalized Reasoning Abilities by Graph Problems Paper • 2507.17168 • Published Jul 23 • 1
Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models Paper • 2508.10751 • Published 23 days ago • 27
Graph-R1 Collection 1.5B and 7B models trained on 3 NP graph problems, achieving SOTA performance on par with GPT-4o and QwQ-32B • 3 items • Updated 21 days ago