Yihe Deng PRO

ydeng9

https://yihe-deng.notion.site/Yihe-Deng-167ab2d2c1fb80b3a76dfb120f716c84

Yihe__Deng

AI & ML interests

LLM post-training

Recent Activity

upvoted a paper 3 days ago

LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model

Paper • 2509.00676 • Published 7 days ago • 74

New activity in ydeng9/OpenVLThinker-7B-v1.2 about 1 month ago

Add project page link to model card

#1 opened about 1 month ago by

published a dataset about 1 month ago

openvlthinker/OpenVLThinker_SFT_iter1

Viewer • Updated May 1 • 22.8k • 32

updated a model about 1 month ago

ydeng9/OpenVLThinker-7B-v1.2

Image-Text-to-Text • 8B • Updated Aug 5 • 53 • 3

updated a collection about 2 months ago

OpenVLThinker-v1.2 Models

4 items • Updated Jul 21 • 2

published a model about 2 months ago

ydeng9/OpenVLThinker-3B-v1.2

4B • Updated May 15 • 6

liked a model about 2 months ago

ydeng9/OpenVLThinker-7B-v1.2

Image-Text-to-Text • 8B • Updated Aug 5 • 53 • 3

updated a model about 2 months ago

ydeng9/OpenVLThinker-7B-v1.2

Image-Text-to-Text • 8B • Updated Aug 5 • 53 • 3

updated a collection about 2 months ago

OpenVLThinker-v1.2 Datasets

3 items • Updated Jul 14 • 2

published a dataset about 2 months ago

ydeng9/OpenVLThinker-grpo-hard

Viewer • Updated May 6 • 6.25k • 26

updated a collection about 2 months ago

OpenVLThinker-v1.2 Models

4 items • Updated Jul 21 • 2

upvoted 2 papers about 2 months ago

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Paper • 2505.22617 • Published May 28 • 130

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2 • 180

updated a collection about 2 months ago

OpenVLThinker-v1.2 Datasets

3 items • Updated Jul 14 • 2

published a dataset about 2 months ago

ydeng9/OpenVLThinker-grpo-medium

Viewer • Updated Apr 24 • 3.3k • 48

updated a collection about 2 months ago

OpenVLThinker-v1.2 Models

4 items • Updated Jul 21 • 2

updated a model about 2 months ago

ydeng9/OpenVLThinker-7B-v1.2-medium-grpo-iter3

8B • Updated Jul 11 • 12

published a model about 2 months ago

ydeng9/OpenVLThinker-7B-v1.2-medium-grpo-iter3

8B • Updated Jul 11 • 12

updated 2 collections about 2 months ago

OpenVLThinker-v1.2 Datasets

3 items • Updated Jul 14 • 2

OpenVLThinker-v1.2 Models

4 items • Updated Jul 21 • 2