The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters • Running • 1.38k
The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks Paper • 2502.08235 • Published 11 days ago • 53
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU Paper • 2502.08910 • Published 11 days ago • 139