The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters • Running • 1.36k
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published 26 days ago • 106
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training Paper • 2501.11425 • Published Jan 20 • 91
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 3 items • Updated 28 days ago • 360
Chinese LLM Leaderboard best models Collection A daily uploaded list of models with best evaluations on the LLM leaderboard • 24 items • Updated 12 days ago • 6