Yuzhen Huang

yuzhen17

https://hyz17.github.io

HYZ17

AI & ML interests

None yet

yuzhen17's activity

updated a model about 5 hours ago

yuzhen17/tmp_mistral_by_qwen_test

Updated about 5 hours ago

published a model about 5 hours ago

yuzhen17/tmp_mistral_by_qwen_test

Updated about 5 hours ago

New activity in hkust-nlp/Qwen-2.5-Math-7B-SimpleRL-Zero about 6 hours ago

Adding `safetensors` variant of this model

#1 opened 2 days ago by SFconvertbot

New activity in hkust-nlp/Qwen-2.5-Math-7B-SimpleRL about 6 hours ago

Adding `safetensors` variant of this model

#2 opened about 9 hours ago by SFconvertbot

liked a Space 4 days ago

1.36k

The Ultra-Scale Playbook

🌌

The ultimate guide to training LLM on large GPU Clusters

New activity in hkust-nlp/Qwen-2.5-Math-7B-SimpleRL 4 days ago

Update README.md

#1 opened 5 days ago by AndrewZeng

updated a collection 5 days ago

SimpleRL

Collection

The collection for the Project "Simple Reinforcement Learning for Reasoning" • 2 items • Updated 5 days ago • 4

updated 2 models 5 days ago

hkust-nlp/Qwen-2.5-Math-7B-SimpleRL

Updated about 6 hours ago • 35 • 1

hkust-nlp/Qwen-2.5-Math-7B-SimpleRL-Zero

Updated about 6 hours ago • 154 • 2

published 2 models 5 days ago

hkust-nlp/Qwen-2.5-Math-7B-SimpleRL

Updated about 6 hours ago • 35 • 1

hkust-nlp/Qwen-2.5-Math-7B-SimpleRL-Zero

Updated about 6 hours ago • 154 • 2

upvoted a paper 5 days ago

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Paper • 2502.11089 • Published 7 days ago • 133

updated a dataset 5 days ago

hkust-nlp/PreSelect-100B

Viewer • Updated 5 days ago • 54.5M • 346 • 1

New activity in hkust-nlp/PreSelect-100B 5 days ago

Delete DCLM-refinedweb

#2 opened 5 days ago by yuzhen17

Delete DCLM-refinedweb

#1 opened 5 days ago by yuzhen17

upvoted a paper about 1 month ago

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published Jan 13 • 91

liked a dataset about 2 months ago

leafspark/o1_reflection

Viewer • Updated Oct 7, 2024 • 3.75k • 99 • 2

authored a paper about 2 months ago

B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners

Paper • 2412.17256 • Published Dec 23, 2024 • 46

upvoted a paper 2 months ago

B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners

Paper • 2412.17256 • Published Dec 23, 2024 • 46