Richard Zhuang PRO

RZ412

https://richardzhuang0412.github.io

AI & ML interests

LLM Routing, LLM + Games, Post-Training, Agents

Recent Activity

upvoted a collection 14 days ago

OpenThinker-Agent

liked a dataset 14 days ago

open-thoughts/OpenThoughts-Agent-v1-SFT

upvoted a collection 14 days ago

Olmo 3 Post-training

upvoted 2 collections 14 days ago

OpenThinker-Agent

5 items • Updated 21 days ago • 5

Olmo 3 Post-training

All artifacts for post-training Olmo 3. Datasets follow the model that resulted from training on them. • 32 items • Updated 3 days ago • 46

upvoted a paper 21 days ago

DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle

Paper • 2512.04324 • Published 23 days ago • 149

upvoted a paper 3 months ago

DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29 • 140

upvoted an article 5 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

Jul 8 • 739

upvoted 2 collections 5 months ago

Reasoning Datasets

50 items • Updated Jun 8 • 10

Reasoning Models

53 items • Updated Jun 8 • 1

upvoted an article 9 months ago

Article

Reasoning Datasets Competition

Apr 9 • 38

upvoted a paper 11 months ago

PokerBench: Training Large Language Models to become Professional Poker Players

Paper • 2501.08328 • Published Jan 14 • 19

upvoted a paper about 1 year ago

EmbedLLM: Learning Compact Representations of Large Language Models

Paper • 2410.02223 • Published Oct 3, 2024 • 3