Jointly Reinforcing Diversity and Quality in Language Model Generations • Paper • 2509.02534 • Published 5 days ago • 23
StepWiser: Stepwise Generative Judges for Wiser Reasoning • Paper • 2508.19229 • Published 12 days ago • 19
Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge • Paper • 2407.19594 • Published Jul 28, 2024 • 21
System 2 Attention (is something you might need too) • Paper • 2311.11829 • Published Nov 20, 2023 • 44
Branch-Solve-Merge Improves Large Language Model Evaluation and Generation • Paper • 2310.15123 • Published Oct 23, 2023 • 8
Chain-of-Verification Reduces Hallucination in Large Language Models • Paper • 2309.11495 • Published Sep 20, 2023 • 39
Leveraging Implicit Feedback from Deployment Data in Dialogue • Paper • 2307.14117 • Published Jul 26, 2023 • 5