shimmyshimmer committed
Commit 2e9b93d · verified · 1 parent: 57b702f

Update README.md

Files changed (1): README.md +4 -6
README.md CHANGED

@@ -15,7 +15,7 @@ tags:
  <strong>See <a href="https://huggingface.co/collections/unsloth/deepseek-r1-all-versions-678e1c48f5d2fce87892ace5">our collection</a> for versions of Deepseek-R1 including GGUF & 4-bit formats.</strong>
  </p>
  <p style="margin-bottom: 0;">
- <em>Unsloth's DeepSeek-R1 <a href="https://unsloth.ai/blog/deepseekr1-dynamic">1.58-bit + 2-bit Dynamic Quants</a> is selectively quantized, greatly improving accuracy over standard 1-bit/2-bit.</em>
+ <em>Unsloth's r1-1776 <a href="https://unsloth.ai/blog/deepseekr1-dynamic">2-bit Dynamic Quants</a> is selectively quantized, greatly improving accuracy over standard 1-bit/2-bit.</em>
  </p>
  <div style="display: flex; gap: 5px; align-items: center; ">
  <a href="https://github.com/unslothai/unsloth/">

@@ -53,7 +53,7 @@ cp llama.cpp/build/bin/llama-* llama.cpp
  from huggingface_hub import snapshot_download
  snapshot_download(
      repo_id = "unsloth/r1-1776-GGUF",
-     local_dir = "r1-1776-GGU",
+     local_dir = "r1-1776-GGUF",
      allow_patterns = ["*Q4_K_M*"], # Select quant type Q4_K_M for 4.5bit
  )
  ```

@@ -100,10 +100,8 @@ snapshot_download(

  | MoE Bits | Type | Disk Size | Accuracy | Link | Details |
  | -------- | -------- | ------------ | ------------ | ---------------------| ---------- |
- | 1.58bit | UD-IQ1_S | **131GB** | Fair | [Link](https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-UD-IQ1_S) | MoE all 1.56bit. `down_proj` in MoE mixture of 2.06/1.56bit |
- | 1.73bit | UD-IQ1_M | **158GB** | Good | [Link](https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-UD-IQ1_M) | MoE all 1.56bit. `down_proj` in MoE left at 2.06bit |
- | 2.22bit | UD-IQ2_XXS | **183GB** | Better | [Link](https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-UD-IQ2_XXS) | MoE all 2.06bit. `down_proj` in MoE mixture of 2.5/2.06bit |
- | 2.51bit | UD-Q2_K_XL | **212GB** | Best | [Link](https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-UD-Q2_K_XL) | MoE all 2.5bit. `down_proj` in MoE mixture of 3.5/2.5bit |
+ | 2.22bit | UD-IQ2_XXS | **183GB** | Better | [Link](https://huggingface.co/unsloth/r1-1776-GGUF/tree/main/r1-1776-UD-IQ2_XXS) | MoE all 2.06bit. `down_proj` in MoE mixture of 2.5/2.06bit |
+ | 2.51bit | UD-Q2_K_XL | **212GB** | Best | [Link](https://huggingface.co/unsloth/r1-1776-GGUF/tree/main/r1-1776-UD-Q2_K_XL) | MoE all 2.5bit. `down_proj` in MoE mixture of 3.5/2.5bit |

  # Finetune your own Reasoning model like R1 with Unsloth!
  We have a free Google Colab notebook for turning Llama 3.1 (8B) into a reasoning model: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.1_(8B)-GRPO.ipynb
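For context on the corrected `snapshot_download` snippet: the `allow_patterns` argument filters repository files with shell-style globs, so `["*Q4_K_M*"]` downloads only the Q4_K_M shards rather than every quant in the repo. A minimal sketch of that glob filtering (the shard filenames below are illustrative, not the repo's actual listing):

```python
from fnmatch import fnmatch

# Hypothetical file listing, as a GGUF repo with multiple quants might look.
files = [
    "r1-1776-Q4_K_M/r1-1776-Q4_K_M-00001-of-00009.gguf",
    "r1-1776-UD-IQ2_XXS/r1-1776-UD-IQ2_XXS-00001-of-00004.gguf",
    "README.md",
]

# Same glob passed to allow_patterns in snapshot_download.
pattern = "*Q4_K_M*"
selected = [f for f in files if fnmatch(f, pattern)]
print(selected)  # only the Q4_K_M shard survives the filter
```

The same mechanism explains why fixing `local_dir` matters only for where files land on disk, while `allow_patterns` controls which files are fetched at all.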