DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails Paper • 2502.05163 • Published 16 days ago • 21
PILAF: Optimal Human Preference Sampling for Reward Modeling Paper • 2502.04270 • Published 17 days ago • 11
Teaching Large Language Models to Reason with Reinforcement Learning Paper • 2403.04642 • Published Mar 7, 2024 • 46
A Tale of Tails: Model Collapse as a Change of Scaling Laws Paper • 2402.07043 • Published Feb 10, 2024 • 15