Article Accelerate ND-Parallel: A Guide to Efficient Multi-GPU Training By siro1 and 4 others • 13 days ago • 50
Article ChatML vs Harmony: Understanding the new Format from OpenAI 🔍 By kuotient • 12 days ago • 24
MolmoAct Collection All models for the MolmoAct (Multimodal Open Language Model for Action) release. • 8 items • Updated 5 days ago • 20
MolmoAct Data Mixture Collection All datasets for the MolmoAct (Multimodal Open Language Model for Action) release. • 3 items • Updated 6 days ago • 11
Article Improving Hugging Face Training Efficiency Through Packing with Flash Attention By lwtr and 5 others • Aug 21, 2024 • 40
Tiny Series Collection Tiny datasets that empower the foundation of Small Language Model! • 11 items • Updated Jan 26, 2024 • 42
GLM-4.5 Collection GLM-4.5: An open-source large language model designed for intelligent agents by Z.ai • 11 items • Updated 10 days ago • 218
Article Improving Parquet Dedupe on Hugging Face Hub By yuchenglow and 1 other • Oct 5, 2024 • 38
Article TimeScope: How Long Can Your Video Large Multimodal Model Go? By orrzohar and 3 others • 29 days ago • 37
Article Fast LoRA inference for Flux with Diffusers and PEFT By sayakpaul and 1 other • 29 days ago • 44
A little guide to building Large Language Models in 2024 Collection Resources mentioned by @thomwolf in https://x.com/Thom_Wolf/status/1773340316835131757 • 19 items • Updated Apr 1, 2024 • 16