Tulu3 with distraction mitigation data Collection LLMs and LRMs can be easily distracted by hidden instructions or irrelevant tasks. We curated SFT and DPO data on which models can be fine-tuned to avoid distraction. • 5 items • Updated Oct 30, 2025 • 2
Distractor Injection Attacks on Large Reasoning Models: Characterization and Defense Paper • 2510.16259 • Published Oct 17, 2025 • 3
FiSCo: Evaluating LLM's Group Level Fairness Collection Generated questions for group-fairness evaluation • 6 items • Updated Oct 6, 2025 • 2
Large Language Models (LLMs) on Tabular Data: Prediction, Generation, and Understanding -- A Survey Paper • 2402.17944 • Published Feb 27, 2024 • 2
FalseReject: A Resource for Improving Contextual Safety and Mitigating Over-Refusals in LLMs via Structured Reasoning Paper • 2505.08054 • Published May 12, 2025 • 3
SATA-BENCH: Select All That Apply Benchmark for Multiple Choice Questions Paper • 2506.00643 • Published May 31, 2025 • 6
Quantifying Fairness in LLMs Beyond Tokens: A Semantic and Statistical Perspective Paper • 2506.19028 • Published Jun 23, 2025 • 4
The Personalization Trap: How User Memory Alters Emotional Reasoning in LLMs Paper • 2510.09905 • Published Oct 10, 2025 • 6