Shizhe Diao

shizhediao

https://shizhediao.github.io/

AI & ML interests

None yet

Recent Activity

updated a model about 10 hours ago

fed-agent/fed-agent

published a model about 10 hours ago

fed-agent/fed-agent

liked a dataset 26 days ago

shizhediao/SCP-116K-cleaned

Organizations

updated a dataset 3 months ago

shizhediao/SCP-116K-cleaned

Viewer • Updated Jun 11 • 25k • 93 • 6

published a dataset 3 months ago

shizhediao/SCP-116K-cleaned

Viewer • Updated Jun 11 • 25k • 93 • 6

liked a model 3 months ago

nvidia/Nemotron-Research-Reasoning-Qwen-1.5B

Text Generation • 2B • Updated 26 days ago • 6.5k • 216

upvoted a paper 3 months ago

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30 • 136

commented a paper 3 months ago

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30 • 136

upvoted a paper 3 months ago

Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding

Paper • 2505.22618 • Published May 28 • 42

commented a paper 3 months ago

Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding

Paper • 2505.22618 • Published May 28 • 42

liked a dataset 4 months ago

nvidia/ClimbMix

Viewer • Updated Apr 22 • 355M • 730 • 31

updated a dataset 5 months ago

OptimalScale/ClimbLab

Viewer • Updated May 4 • 1.24B • 5.31k • 10

liked 3 datasets 5 months ago

authored a paper 5 months ago

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Paper • 2504.13161 • Published Apr 17 • 93

upvoted a paper 5 months ago

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Paper • 2504.13161 • Published Apr 17 • 93

commented a paper 5 months ago

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Paper • 2504.13161 • Published Apr 17 • 93

authored a paper 10 months ago

Hymba: A Hybrid-head Architecture for Small Language Models

Paper • 2411.13676 • Published Nov 20, 2024 • 46

authored a paper 11 months ago

Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models

Paper • 2410.03290 • Published Oct 4, 2024 • 7
