LLM-as-a-judge - a JM-Brun Collection

LLM-as-a-judge

updated Jul 29

Preference Leakage: A Contamination Problem in LLM-as-a-judge

Paper • 2502.01534 • Published Feb 3 • 42

Great Models Think Alike and this Undermines AI Oversight

Paper • 2502.04313 • Published Feb 6 • 34

CLASH: Evaluating Language Models on Judging High-Stakes Dilemmas from Multiple Perspectives

Paper • 2504.10823 • Published Apr 15 • 16

CLEAR: Error Analysis via LLM-as-a-Judge Made Easy

Paper • 2507.18392 • Published Jul 24 • 19