zhang
kekueknu2
AI & ML interests
None yet
Recent Activity
upvoted a paper about 2 months ago
daVinci-Dev: Agent-native Mid-training for Software Engineering
upvoted an article about 1 year ago
From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning
upvoted an article over 1 year ago
Illustrating Reinforcement Learning from Human Feedback (RLHF)