A Tale of Tails: Model Collapse as a Change of Scaling Laws
Paper
•
2402.07043
•
Published
•
15
What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT
Paper
•
2509.19284
•
Published
•
22
OnePiece: Bringing Context Engineering and Reasoning to Industrial Cascade Ranking System
Paper
•
2509.18091
•
Published
•
33
Strategic Dishonesty Can Undermine AI Safety Evaluations of Frontier LLMs
Paper
•
2509.18058
•
Published
•
12
Igniting Creative Writing in Small Language Models: LLM-as-a-Judge versus Multi-Agent Refined Rewards
Paper
•
2508.21476
•
Published
•
3
Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs
Paper
•
2404.14461
•
Published
•
3
Universal Jailbreak Backdoors from Poisoned Human Feedback
Paper
•
2311.14455
•
Published
•
3
StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets?
Paper
•
2510.02209
•
Published
•
53
Tree-based Dialogue Reinforced Policy Optimization for Red-Teaming Attacks
Paper
•
2510.02286
•
Published
•
28