Seungone Kim PRO

seungone

https://seungonekim.github.io/

AI & ML interests

Large Language Models, LLM-as-a-Judge, Reward Model Overoptimization, Personalized Alignment

Recent Activity

upvoted a paper 22 days ago

Reasoning over mathematical objects: on-policy reward modeling and test time aggregation

authored a paper 3 months ago

Measuring Sycophancy of Language Models in Multi-turn Dialogues

authored a paper 3 months ago

Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning

Organizations

upvoted a paper 22 days ago

Reasoning over mathematical objects: on-policy reward modeling and test time aggregation

Paper • 2603.18886 • Published 23 days ago • 6

authored 5 papers 3 months ago

Measuring Sycophancy of Language Models in Multi-turn Dialogues

Paper • 2505.23840 • Published May 28, 2025 • 3

Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning

Paper • 2507.00432 • Published Jul 1, 2025 • 79

OptimalThinkingBench: Evaluating Over and Underthinking in LLMs

Paper • 2508.13141 • Published Aug 18, 2025

VideoJudge: Bootstrapping Enables Scalable Supervision of MLLM-as-a-Judge for Video Understanding

Paper • 2509.21451 • Published Sep 25, 2025

SPICE: Self-Play In Corpus Environments Improves Reasoning

Paper • 2510.24684 • Published Oct 28, 2025 • 18

liked a dataset 3 months ago

proxima-fusion/constellaration

Viewer • Updated Oct 28, 2025 • 930k • 2.07k • 19

liked a dataset 4 months ago

facebook/principia-bench

Viewer • Updated Dec 18, 2025 • 2.24k • 304 • 19

authored a paper 4 months ago

RefineBench: Evaluating Refinement Capability of Language Models via Checklists

Paper • 2511.22173 • Published Nov 27, 2025 • 15

upvoted a paper 4 months ago

RefineBench: Evaluating Refinement Capability of Language Models via Checklists

Paper • 2511.22173 • Published Nov 27, 2025 • 15

commented on a paper 4 months ago

RefineBench: Evaluating Refinement Capability of Language Models via Checklists

Paper • 2511.22173 • Published Nov 27, 2025 • 15

liked a dataset 5 months ago

RefineBench/RefineBench

Viewer • Updated Dec 2, 2025 • 1k • 1.63k • 5

updated a dataset 5 months ago

facebook/principia-collection

Viewer • Updated Dec 19, 2025 • 554k • 389 • 44

liked a dataset 5 months ago

facebook/principia-collection

Viewer • Updated Dec 19, 2025 • 554k • 389 • 44

published a dataset 5 months ago

facebook/principia-collection

Viewer • Updated Dec 19, 2025 • 554k • 389 • 44

upvoted a paper 5 months ago

SPICE: Self-Play In Corpus Environments Improves Reasoning

Paper • 2510.24684 • Published Oct 28, 2025 • 18

liked 2 datasets 9 months ago

toloka/u-math

Viewer • Updated Jan 30 • 1.1k • 211 • 26

xw27/scibench

Viewer • Updated May 6, 2024 • 692 • 1.79k • 24

upvoted a paper 9 months ago

Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning

Paper • 2507.00432 • Published Jul 1, 2025 • 79

upvoted a paper 10 months ago

Text2Grad: Reinforcement Learning from Natural Language Feedback

Paper • 2505.22338 • Published May 28, 2025 • 8
