Article Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models By AI-MO and 18 others • Jul 10 • 49
MLGym: A New Framework and Benchmark for Advancing AI Research Agents Paper • 2502.14499 • Published Feb 20 • 193
xukp20/Llama-3-8B-Instruct-SPPO-score-Iter3_gp_8b-table-0.002 Text Generation • 8B • Updated Sep 29, 2024 • 4
xukp20/Llama-3-8B-Instruct-SPPO-score-Iter3_bt_8b-table-0.002 Text Generation • 8B • Updated Sep 28, 2024 • 8
xukp20/Llama-3-8B-Instruct-SPPO-score-Iter3_bt_2b-table-0.001 Text Generation • 8B • Updated Sep 28, 2024 • 7
xukp20/Llama-3-8B-Instruct-SPPO-score-Iter3_gp_2b-table-0.001 Text Generation • 8B • Updated Sep 28, 2024 • 9