Sample-Efficient Human Evaluation of Large Language Models via Maximum Discrepancy Competition • Paper 2404.08008 • Published Apr 10, 2024
ERPO: Advancing Safety Alignment via Ex-Ante Reasoning Preference Optimization • Paper 2504.02725 • Published Apr 3, 2025