shimmyshimmer committed
Commit 2e9b93d · verified · 1 parent: 57b702f

Update README.md

Files changed (1): README.md +4 -6
README.md CHANGED

@@ -15,7 +15,7 @@ tags:
  <strong>See <a href="https://huggingface.co/collections/unsloth/deepseek-r1-all-versions-678e1c48f5d2fce87892ace5">our collection</a> for versions of Deepseek-R1 including GGUF & 4-bit formats.</strong>
  </p>
  <p style="margin-bottom: 0;">
- <em>Unsloth's DeepSeek-R1 <a href="https://unsloth.ai/blog/deepseekr1-dynamic">1.58-bit + 2-bit Dynamic Quants</a> is selectively quantized, greatly improving accuracy over standard 1-bit/2-bit.</em>
+ <em>Unsloth's r1-1776 <a href="https://unsloth.ai/blog/deepseekr1-dynamic">2-bit Dynamic Quants</a> is selectively quantized, greatly improving accuracy over standard 1-bit/2-bit.</em>
  </p>
  <div style="display: flex; gap: 5px; align-items: center; ">
  <a href="https://github.com/unslothai/unsloth/">

@@ -53,7 +53,7 @@ cp llama.cpp/build/bin/llama-* llama.cpp
  from huggingface_hub import snapshot_download
  snapshot_download(
      repo_id = "unsloth/r1-1776-GGUF",
-     local_dir = "r1-1776-GGU",
+     local_dir = "r1-1776-GGUF",
      allow_patterns = ["*Q4_K_M*"], # Select quant type Q4_K_M for 4.5bit
  )
  ```

@@ -100,10 +100,8 @@ snapshot_download(

  | MoE Bits | Type | Disk Size | Accuracy | Link | Details |
  | -------- | -------- | ------------ | ------------ | ---------------------| ---------- |
- | 1.58bit | UD-IQ1_S | **131GB** | Fair | [Link](https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-UD-IQ1_S) | MoE all 1.56bit. `down_proj` in MoE mixture of 2.06/1.56bit |
- | 1.73bit | UD-IQ1_M | **158GB** | Good | [Link](https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-UD-IQ1_M) | MoE all 1.56bit. `down_proj` in MoE left at 2.06bit |
- | 2.22bit | UD-IQ2_XXS | **183GB** | Better | [Link](https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-UD-IQ2_XXS) | MoE all 2.06bit. `down_proj` in MoE mixture of 2.5/2.06bit |
- | 2.51bit | UD-Q2_K_XL | **212GB** | Best | [Link](https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-UD-Q2_K_XL) | MoE all 2.5bit. `down_proj` in MoE mixture of 3.5/2.5bit |
+ | 2.22bit | UD-IQ2_XXS | **183GB** | Better | [Link](https://huggingface.co/unsloth/r1-1776-GGUF/tree/main/r1-1776-UD-IQ2_XXS) | MoE all 2.06bit. `down_proj` in MoE mixture of 2.5/2.06bit |
+ | 2.51bit | UD-Q2_K_XL | **212GB** | Best | [Link](https://huggingface.co/unsloth/r1-1776-GGUF/tree/main/r1-1776-UD-Q2_K_XL) | MoE all 2.5bit. `down_proj` in MoE mixture of 3.5/2.5bit |

  # Finetune your own Reasoning model like R1 with Unsloth!
  We have a free Google Colab notebook for turning Llama 3.1 (8B) into a reasoning model: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.1_(8B)-GRPO.ipynb
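For context on the corrected `snapshot_download` snippet: the `allow_patterns` argument filters repository files with shell-style globs, so `["*Q4_K_M*"]` downloads only the Q4_K_M shards rather than every quant in the repo. A minimal sketch of that glob filtering (the shard filenames below are illustrative, not the repo's actual listing):

```python
from fnmatch import fnmatch

# Hypothetical file listing, as a GGUF repo with multiple quants might look.
files = [
    "r1-1776-Q4_K_M/r1-1776-Q4_K_M-00001-of-00009.gguf",
    "r1-1776-UD-IQ2_XXS/r1-1776-UD-IQ2_XXS-00001-of-00004.gguf",
    "README.md",
]

# Same glob passed to allow_patterns in snapshot_download.
pattern = "*Q4_K_M*"
selected = [f for f in files if fnmatch(f, pattern)]
print(selected)  # only the Q4_K_M shard survives the filter
```

The same mechanism explains why fixing `local_dir` matters only for where files land on disk, while `allow_patterns` controls which files are fetched at all.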