Shengnan An

ShengnanAn

AI & ML interests

None yet

Recent Activity

commented on a paper 12 days ago

Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

Paper • 2512.10739 • Published 17 days ago • 45

New activity in deepseek-ai/DeepSeek-V3.2 19 days ago

Default system prompt can hinder thinking-mode performance

#31 opened 19 days ago by

ShengnanAn

New activity in deepseek-ai/DeepSeek-V3.2 23 days ago

Should the thinking mode in DS v3.2 use the default system prompt?

#24 opened 23 days ago by

ShengnanAn

updated a dataset 28 days ago

meituan-longcat/AMO-Bench

Viewer • Updated 28 days ago • 50 • 640 • 24

New activity in moonshotai/Kimi-K2-Thinking about 2 months ago

Awesome work! Do you want to try AMO-Bench, the most challenging MO-level benchmark?

#3 opened about 2 months ago by

ShengnanAn

authored a paper about 2 months ago

AMO-Bench: Large Language Models Still Struggle in High School Math Competitions

Paper • 2510.26768 • Published Oct 30 • 33

New activity in meituan-longcat/AMO-Bench about 2 months ago

Problem 35: Official solution misuses “positive integers”, final count should be 7656 (not 7657)

#4 opened about 2 months ago by

applesilicon

liked a dataset about 2 months ago

meituan-longcat/UNO-Bench

Viewer • Updated 24 days ago • 3.73k • 5.51k • 21

New activity in meituan-longcat/AMO-Bench about 2 months ago

Improve dataset card: Add task category, paper, project page, code, abstract, features, leaderboard, and sample usage

👍 1

#2 opened about 2 months ago by

nielsr

updated a dataset about 2 months ago

meituan-longcat/UNO-Bench

Viewer • Updated 24 days ago • 3.73k • 5.51k • 21

liked a dataset about 2 months ago

meituan-longcat/AMO-Bench

Viewer • Updated 28 days ago • 50 • 640 • 24

New activity in meituan-longcat/AMO-Bench about 2 months ago

Problem 26 seems identical to Berkeley Math Circle 2014–2015 Monthly Contest 3, Problem 4

#3 opened about 2 months ago by

applesilicon

upvoted a paper about 2 months ago

AMO-Bench: Large Language Models Still Struggle in High School Math Competitions

Paper • 2510.26768 • Published Oct 30 • 33

commented on a paper about 2 months ago

AMO-Bench: Large Language Models Still Struggle in High School Math Competitions

Paper • 2510.26768 • Published Oct 30 • 33

upvoted a paper over 1 year ago

Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies

Paper • 2407.13623 • Published Jul 18, 2024 • 56

liked a model over 1 year ago

In2Training/FILM-7B

Text Generation • 7B • Updated May 9, 2024 • 68 • 11

updated 3 models over 1 year ago

authored a paper over 1 year ago

Make Your LLM Fully Utilize the Context

Paper • 2404.16811 • Published Apr 25, 2024 • 55
