🧠 Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 12 items • Updated 3 days ago • 79
DistilBERT release Collection Original DistilBERT model, with checkpoints obtained through teacher-student learning from the original BERT checkpoints. • 6 items • Updated Apr 17, 2024 • 19
DurIAN-E: Duration Informed Attention Network For Expressive Text-to-Speech Synthesis Paper • 2309.12792 • Published Sep 22, 2023 • 1
MM-LLMs: Recent Advances in MultiModal Large Language Models Paper • 2401.13601 • Published Jan 24, 2024 • 47
Zurich 1.5B (GGUF) Collection Quantized versions of Zurich 1.5B Model Collection, compatible with llama.cpp. Quantized by mradermacher - Fine-tuned from Qwen 2.5 1.5B Instruct • 12 items • Updated 8 days ago • 2
Geneva 12B (GGUF) Collection Quantized versions of Geneva 12B Model Collection, compatible with llama.cpp. Quantized by mradermacher - Fine-tuned from Mistral NeMo Instruct 2407 • 12 items • Updated 8 days ago • 2
Zurich 14B (GGUF) Collection Quantized versions of Zurich 14B Model Collection, compatible with llama.cpp. Quantized by mradermacher - Fine-tuned from Qwen 2.5 14B Instruct • 12 items • Updated 8 days ago • 3
Zurich 7B (GGUF) Collection Quantized versions of Zurich 7B Model Collection, compatible with llama.cpp. Quantized by mradermacher - Fine-tuned from Qwen 2.5 7B Instruct • 12 items • Updated 8 days ago • 3
Zurich 1.5B Collection The Zurich 1.5B Model Collection - Fine-tuned from Qwen 2.5 1.5B Instruct with GammaCorpus v2. • 6 items • Updated 19 days ago • 2
GammaCorpus (CoT) Collection The GammaCorpus Dataset Collection for CoT (Chain of Thought) • 1 item • Updated 19 days ago • 9
Large Language Models Think Too Fast To Explore Effectively Paper • 2501.18009 • Published 25 days ago • 23
Video Generation models Collection The domain of video generation is booming. Here is a list of selected Open Access video generation (T2V) models. • 14 items • Updated Aug 27, 2024 • 14