Ovis2 Collection Our latest advancement in multi-modal large language models (MLLMs) • 8 items • Updated 7 days ago • 51
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published 3 days ago • 87
Step-Audio Collection Step-Audio model family, including Audio-Tokenizer, Audio-Chat and TTS • 3 items • Updated 6 days ago • 26
Hamanasu Collection A brand new series of Models from yours truly, Designed for Intelligence, Creativity and Roleplay. • 9 items • Updated 9 days ago • 4
OLMoE (January 2025) Collection Improved OLMoE for iOS app. Read more: https://allenai.org/blog/olmoe-app • 10 items • Updated 12 days ago • 9
SFTvsRL Models & Data Collection This collection contains 4 initial checkpoints for https://github.com/LeslieTrue/SFTvsRL and necessary data for V-IRL training. • 5 items • Updated 19 days ago • 8
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published 26 days ago • 106
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 16 days ago • 194
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 3 items • Updated 28 days ago • 360
Qwen2.5-1M Collection The long-context version of Qwen2.5, supporting 1M-token context lengths • 2 items • Updated 28 days ago • 100
Dolphin 3.0 Collection Dolphin 3.0 is the next generation of the Dolphin series of instruct-tuned models. Designed to be the ultimate general purpose local model. • 9 items • Updated 17 days ago • 92
Granite 3.1 Language Models Collection A series of language models with 128K context length trained by IBM licensed under Apache 2.0 license. • 9 items • Updated 4 days ago • 57