Social Bias Benchmark for Generation: A Comparison of Generation and QA-Based Evaluations Paper • 2503.06987 • Published Mar 10
MUG-Eval: A Proxy Evaluation Framework for Multilingual Generation Capabilities in Any Language Paper • 2505.14395 • Published May 20
Survey of Cultural Awareness in Language Models: Text and Beyond Paper • 2411.00860 • Published Oct 30, 2024