Jiang

Dongwei

·

Some-random

AI & ML interests

None yet

Recent Activity

authored a paper 8 days ago

Feedback Friction: LLMs Struggle to Fully Incorporate External Feedback

updated a dataset 13 days ago

Dongwei/Complete_Model_Results_Dataset

published a dataset 13 days ago

Dongwei/Complete_Model_Results_Dataset

Papers 4

arxiv:2506.11930

arxiv:2410.01044

arxiv:2409.12183

arxiv:2407.09007

Models 17

Dongwei/Qwen-2.5-7B_Base_Math_smalllr_newdata

Text Generation • 8B • Updated Feb 13 • 3

Dongwei/Qwen-2.5-7B_Base_Math_smalllr_longer

Text Generation • 8B • Updated Feb 11 • 3

Dongwei/Qwen-2.5-7B_Base_Math_smallestlr

Text Generation • 8B • Updated Feb 11 • 3

Dongwei/Qwen-2.5-7B_Base_Math_smallestlr_newdata

Text Generation • 8B • Updated Feb 5 • 3

Dongwei/Qwen-2.5-7B_Base_Math_smalllr

Text Generation • 8B • Updated Feb 5 • 2 • 6

Dongwei/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math_lowlr

Text Generation • 8B • Updated Feb 4 • 3

Dongwei/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_Math_smalllr

Text Generation • 2B • Updated Feb 4 • 2

Dongwei/Qwen2.5-1.5B-Open-R1-GRPO_Math_smalllr

Text Generation • 2B • Updated Feb 4 • 2

Dongwei/Qwen-2.5-7B_Math_smalllr

Text Generation • 8B • Updated Feb 4 • 2

Dongwei/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math

Text Generation • 8B • Updated Feb 4 • 8

Datasets 5

Dongwei/Complete_Model_Results_Dataset

Viewer • Updated 13 days ago • 3.5k • 108

Dongwei/Comprehensive_Feedback_Dataset

Viewer • Updated 13 days ago • 1.4k • 99

Dongwei/Feedback_Friction_Dataset

Viewer • Updated Jun 17 • 394 • 201 • 2

Dongwei/Math_8K_for_GRPO

Viewer • Updated Feb 5 • 8.89k • 37 • 3

Dongwei/reasoning_world_model

Viewer • Updated Apr 22, 2024 • 15.2k • 9 • 6