Running 1.38k 1.38k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters
Running 1.38k 1.38k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters
Domino: Eliminating Communication in LLM Training via Generic Tensor Slicing and Overlapping Paper • 2409.15241 • Published Sep 23, 2024 • 1
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22, 2024 • 256
view post Post 3321 Native tensor parallel has landed in transformers!!! https://github.com/huggingface/transformers/pull/34184 thanks a lot to the torch team for their support! Contributions are welcome to support more models! 🔥 🔥 13 13 ❤️ 5 5 🤯 3 3 🤝 3 3 + Reply