-
FLAME: Factuality-Aware Alignment for Large Language Models
Paper • 2405.01525 • Published • 29 -
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
Paper • 2405.14333 • Published • 42 -
Transformers Can Do Arithmetic with the Right Embeddings
Paper • 2405.17399 • Published • 55 -
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture
Paper • 2405.18991 • Published • 12
Collections
Discover the best community collections!
Collections including paper arxiv:2509.02547
-
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 143 -
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use
Paper • 2509.01055 • Published • 59 -
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
Paper • 2509.02479 • Published • 76
-
Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR
Paper • 2509.02522 • Published • 22 -
Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
Paper • 2508.20751 • Published • 85 -
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 143 -
Towards a Unified View of Large Language Model Post-Training
Paper • 2509.04419 • Published • 46
-
Global Lyapunov functions: a long-standing open problem in mathematics, with symbolic transformers
Paper • 2410.08304 • Published -
Analytical Lyapunov Function Discovery: An RL-based Generative Approach
Paper • 2502.02014 • Published -
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 143
-
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
Paper • 2509.02544 • Published • 100 -
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
Paper • 2509.02479 • Published • 76 -
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 143
-
FLAME: Factuality-Aware Alignment for Large Language Models
Paper • 2405.01525 • Published • 29 -
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
Paper • 2405.14333 • Published • 42 -
Transformers Can Do Arithmetic with the Right Embeddings
Paper • 2405.17399 • Published • 55 -
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture
Paper • 2405.18991 • Published • 12
-
Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR
Paper • 2509.02522 • Published • 22 -
Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
Paper • 2508.20751 • Published • 85 -
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 143 -
Towards a Unified View of Large Language Model Post-Training
Paper • 2509.04419 • Published • 46
-
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 143 -
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use
Paper • 2509.01055 • Published • 59 -
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
Paper • 2509.02479 • Published • 76
-
Global Lyapunov functions: a long-standing open problem in mathematics, with symbolic transformers
Paper • 2410.08304 • Published -
Analytical Lyapunov Function Discovery: An RL-based Generative Approach
Paper • 2502.02014 • Published -
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 143
-
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
Paper • 2509.02544 • Published • 100 -
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
Paper • 2509.02479 • Published • 76 -
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 143