Yihe Deng PRO

ydeng9

https://yihe-deng.notion.site/Yihe-Deng-167ab2d2c1fb80b3a76dfb120f716c84

Yihe__Deng

AI & ML interests

LLM post-training

Recent Activity

updated a dataset 16 days ago

ydeng9/OpenVLThinker-grpo-hard

updated a dataset 16 days ago

ydeng9/OpenVLThinker-grpo-medium

published a dataset 4 months ago

ydeng9/swe-smith-rl-distill

New activity in ydeng9/OpenVLThinker-7B-v1.2 6 months ago

Add project page link to model card

#1 opened 6 months ago by

New activity in ydeng9/OpenVLThinker-7B 10 months ago

Highlight code

#2 opened 10 months ago by

Add library name and pipeline tag

#1 opened 10 months ago by

commented a paper 10 months ago

OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement

Paper • 2503.17352 • Published Mar 21, 2025 • 24

New activity in DuoGuard/DuoGuard-1.5B-transfer 12 months ago

Add link to code

#1 opened 12 months ago by

New activity in DuoGuard/DuoGuard-1B-Llama-3.2-transfer 12 months ago

Add link to Github repository

#1 opened 12 months ago by

New activity in DuoGuard/DuoGuard-0.5B 12 months ago

Add link to Github repository

#3 opened 12 months ago by

Add library name

#2 opened 12 months ago by

Add link to paper, add pipeline tag

#1 opened 12 months ago by

commented a paper 12 months ago

DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails

Paper • 2502.05163 • Published Feb 7, 2025 • 22

commented a paper about 1 year ago

Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning

Paper • 2410.22304 • Published Oct 29, 2024 • 18

commented 2 papers over 1 year ago

MIRAI: Evaluating LLM Agents for Event Forecasting

Paper • 2407.01231 • Published Jul 1, 2024 • 18

New activity in UCLA-AGI/zephyr-7b-sft-full-SPIN-iter1 almost 2 years ago

Training code

#1 opened almost 2 years ago by

New activity in UCLA-AGI/zephyr-7b-sft-full-SPIN-iter2 almost 2 years ago

How to reproduce the results ?

#1 opened about 2 years ago by
