Lossless Acceleration of Large Language Models with Hierarchical Drafting based on Temporal Locality in Speculative Decoding Paper • 2502.05609 • Published 15 days ago • 15 • 3