Article Accelerate ND-Parallel: A Guide to Efficient Multi-GPU Training By siro1 and 4 others • about 1 month ago • 58
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM By ariG23498 and 3 others • Mar 12 • 460
Arcee's MergeKit: A Toolkit for Merging Large Language Models Paper • 2403.13257 • Published Mar 20, 2024 • 20
ReMM series Collection Models based on MythoMax with updated base models. • 4 items • Updated Oct 12, 2023 • 10