Long Context - 16k, 32k, 64k, 128k, 200k, 256k, 512k, 1000k Collection Q6/Q8 models here. Mixtrals/Mistral (and merges) generally have 32k context (not listed here). Please see the org model card for usage / templates. • 69 items • Updated 7 days ago • 10
Open-source speech datasets annotated using Data-Speech Collection Open-source annotated speech datasets ranging from 1,000 hours to 45,000 hours. • 11 items • Updated Aug 8, 2024 • 5
DeepSeek-R1-ReDistill Collection Re-distilled DeepSeek R1 models • 4 items • Updated 25 days ago • 14
Sana Collection ⚡️Sana: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer • 21 items • Updated 13 days ago • 88
Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state-of-the-art open post-training recipes. • 33 items • Updated 13 days ago • 70
Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned models in 5 sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Nov 28, 2024 • 357
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated Dec 13, 2024 • 330
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 16 items • Updated Dec 13, 2024 • 145