Squeezed Attention: Accelerating Long Context Length LLM Inference • Paper 2411.09688 • Published Nov 14, 2024
Arbitrage: Efficient Reasoning via Advantage-Aware Speculation • Paper 2512.05033 • Published 24 days ago