Wanwei He

Ancient

AI & ML interests

Dialog System

Recent Activity

commented on a paper 4 days ago

Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR

upvoted a paper 4 days ago

Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR

liked a dataset 26 days ago

llm-blender/Unified-Feedback

liked a dataset 26 days ago

llm-blender/Unified-Feedback

Viewer • Updated Mar 31, 2024 • 1.79M • 1.44k • 18

liked a dataset 3 months ago

allenai/reward-bench-2

Viewer • Updated Jun 4 • 1.87k • 9.79k • 24

liked a dataset 6 months ago

xiushenghuang/open_r1_dataset

Viewer • Updated Feb 26 • 2.59M • 178 • 5

liked a model 7 months ago

perplexity-ai/r1-1776

Text Generation • 671B • Updated Feb 26 • 42.6k • 2.31k

liked a Space 12 months ago

Reward Bench Leaderboard

Display and analyze reward model evaluation results

liked a dataset over 1 year ago

bigcode/the-stack-dedup

Viewer • Updated Aug 17, 2023 • 237M • 4.08k • 366

liked 3 datasets about 2 years ago

BAAI/COIG-PC

Viewer • Updated Jun 14, 2024 • 540M • 346 • 270

OpenAssistant/oasst1

Viewer • Updated May 2, 2023 • 88.8k • 7.22k • 1.43k

RyokoAI/ShareGPT52K

Preview • Updated Apr 2, 2023 • 470 • 341