Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models • Paper • 2410.07985 • Published Oct 10, 2024 • 32
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling • Paper • 2502.06703 • Published 13 days ago • 134
A Long Way to Go: Investigating Length Correlations in RLHF • Paper • 2310.03716 • Published Oct 5, 2023 • 10