wenxueru

https://github.com/wenxueru

AI & ML interests

None yet

Recent Activity

upvoted a paper 4 days ago

PVPO: Pre-Estimated Value-Based Policy Optimization for Agentic Reasoning

upvoted a paper 2 months ago

Reinforcement Pre-Training

authored a paper 3 months ago

Critic-CoT: Boosting the reasoning abilities of large language model via Chain-of-thoughts Critic

Organizations

None yet

upvoted a paper 4 days ago

PVPO: Pre-Estimated Value-Based Policy Optimization for Agentic Reasoning

Paper • 2508.21104 • Published 10 days ago • 28

upvoted a paper 2 months ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9 • 260

upvoted a paper 3 months ago

Critic-CoT: Boosting the reasoning abilities of large language model via Chain-of-thoughts Critic

Paper • 2408.16326 • Published Aug 29, 2024 • 1

upvoted a collection 4 months ago

Qwen3

84 items • Updated Aug 6 • 1.19k

upvoted a paper 8 months ago

Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models

Paper • 2501.01830 • Published Jan 3 • 18