Probing LLMs for Joint Encoding of Linguistic Categories Paper • 2310.18696 • Published Oct 28, 2023 • 1
How far can bias go? -- Tracing bias from pretraining data to alignment Paper • 2411.19240 • Published Nov 28, 2024
CIVICS: Building a Dataset for Examining Culturally-Informed Values in Large Language Models Paper • 2405.13974 • Published May 22, 2024 • 9