Running 1.38k 1.38k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters
deepseek-ai/DeepSeek-R1-Distill-Llama-70B Text Generation • Updated 15 days ago • 472k • • 580
view article Article LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!) By wolfram • Apr 24, 2024 • 61
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning Paper • 2410.02884 • Published Oct 3, 2024 • 54