Paper: TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows • arXiv:2512.05150 • Published Dec 3, 2025 • 74 upvotes
Paper: Reasoning with Confidence: Efficient Verification of LLM Reasoning Steps via Uncertainty Heads • arXiv:2511.06209 • Published Nov 9, 2025 • 18 upvotes
Paper: MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use • arXiv:2509.24002 • Published Sep 28, 2025 • 174 upvotes
Paper: Language Models Can Learn from Verbal Feedback Without Scalar Rewards • arXiv:2509.22638 • Published Sep 26, 2025 • 70 upvotes
Paper: Balancing Truthfulness and Informativeness with Uncertainty-Aware Instruction Fine-Tuning • arXiv:2502.11962 • Published Feb 17, 2025 • 38 upvotes
Paper: SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis • arXiv:2506.02096 • Published Jun 2, 2025 • 52 upvotes
Paper: Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning • arXiv:2506.01939 • Published Jun 2, 2025 • 187 upvotes
Paper: Sherlock: Self-Correcting Reasoning in Vision-Language Models • arXiv:2505.22651 • Published May 28, 2025 • 48 upvotes
Paper: Reinforcement Fine-Tuning Powers Reasoning Capability of Multimodal Large Language Models • arXiv:2505.18536 • Published May 24, 2025 • 18 upvotes
Paper: Optimizing Anytime Reasoning via Budget Relative Policy Optimization • arXiv:2505.13438 • Published May 19, 2025 • 36 upvotes
Paper: REFINE-AF: A Task-Agnostic Framework to Align Language Models via Self-Generated Instructions using Reinforcement Learning from Automated Feedback • arXiv:2505.06548 • Published May 10, 2025 • 30 upvotes
Collection: 🚀 Active PRM (Efficient Process Reward Model Training via Active Learning) • 4 items • Updated Apr 16, 2025 • 3 upvotes