Running 1.37k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published 7 days ago • 133
Step-Audio Collection Step-Audio model family, including Audio-Tokenizer, Audio-Chat and TTS • 3 items • Updated 6 days ago • 26
Lucy-in-the-Sky/Mistral-Small-24B-Instruct-2501-reasoning-Q6_K-GGUF Text Generation • Updated 6 days ago • 67 • 1
yentinglin/Mistral-Small-24B-Instruct-2501-reasoning Text Generation • Updated 3 days ago • 884 • 43
bartowski/SicariusSicariiStuff_Phi-lthy4-GGUF Text Generation • Updated 11 days ago • 2.04k • 6