See our collection for versions of DeepSeek-R1, including GGUF & 4-bit formats.

Unsloth's r1-1776 2-bit Dynamic Quants are selectively quantized, which greatly improves accuracy over standard 1-bit/2-bit quantization.

Finetune your own Reasoning model like R1 with Unsloth!

We have a free Google Colab notebook for turning Llama 3.1 (8B) into a reasoning model: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.1_(8B)-GRPO.ipynb
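GRPO optimizes a model against reward functions that score sampled completions. The sketch below illustrates the kind of reward a reasoning finetune might use. It is not taken from the notebook: the `<think>...</think>` output format, the function names, and the reward values are assumptions for illustration. The `(completions, **kwargs) -> list of floats` shape follows the convention used by TRL-style GRPO trainers, with completions assumed to be plain strings.

```python
import re

# Assumed output format (hypothetical): reasoning inside <think>...</think>,
# followed by the final answer.
THINK_PATTERN = re.compile(r"^<think>\s*(.+?)\s*</think>\s*(.+)$", re.DOTALL)


def format_reward(completions, **kwargs):
    """Give 1.0 to completions with a non-empty think block plus an answer, else 0.0."""
    return [1.0 if THINK_PATTERN.match(text.strip()) else 0.0 for text in completions]


def correctness_reward(completions, answer, **kwargs):
    """Give 2.0 when the text after </think> exactly matches the reference answer."""
    rewards = []
    for text, expected in zip(completions, answer):
        match = THINK_PATTERN.match(text.strip())
        final = match.group(2).strip() if match else ""
        rewards.append(2.0 if final == str(expected).strip() else 0.0)
    return rewards


samples = ["<think>2 + 2 is 4</think> 4", "4"]
print(format_reward(samples))                          # first sample is well-formed, second is not
print(correctness_reward(samples, answer=["4", "4"]))  # only the well-formed sample can be scored
```

Splitting format and correctness into separate functions lets each be weighted or swapped independently; exact string matching is the simplest possible check and would likely need loosening for real math answers.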

✨ Finetune for Free

All notebooks are beginner friendly! Add your dataset, click "Run All", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face.

| Unsloth supports | Free Notebooks | Performance | Memory use |
|---|---|---|---|
| GRPO with Phi-4 (14B) | ▶️ Start on Colab | 2x faster | 80% less |
| Llama-3.2 (3B) | ▶️ Start on Colab | 2.4x faster | 58% less |
| Llama-3.2 (11B vision) | ▶️ Start on Colab | 2x faster | 60% less |
| Qwen2 VL (7B) | ▶️ Start on Colab | 1.8x faster | 60% less |
| Qwen2.5 (7B) | ▶️ Start on Colab | 2x faster | 60% less |
| Llama-3.1 (8B) | ▶️ Start on Colab | 2.4x faster | 58% less |
| Phi-3.5 (mini) | ▶️ Start on Colab | 2x faster | 50% less |
| Gemma 2 (9B) | ▶️ Start on Colab | 2.4x faster | 58% less |
| Mistral (7B) | ▶️ Start on Colab | 2.2x faster | 62% less |

R1 1776 Distill Llama 70B

Blog link: https://perplexity.ai/hub/blog/open-sourcing-r1-1776

This is a Llama 70B distilled version of R1 1776.

R1 1776 is a DeepSeek-R1 reasoning model that has been post-trained by Perplexity AI to remove Chinese Communist Party censorship. The model provides unbiased, accurate, and factual information while maintaining high reasoning capabilities.

Evals

To ensure our model remains fully “uncensored” and capable of engaging with a broad spectrum of sensitive topics, we curated a diverse, multilingual evaluation set of over 1,000 examples that comprehensively cover such subjects. We then use human annotators as well as carefully designed LLM judges to measure the likelihood that a model will evade the queries or provide overly sanitized responses.
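As a rough illustration of how such a likelihood could be summarized, the sketch below turns per-example judge verdicts into a percentage. The verdict format, the function name, and the sample data are hypothetical; the actual annotation and judging pipeline is not described here.

```python
def evasion_rate(verdicts):
    """Return the percentage of responses judged evasive or overly sanitized.

    `verdicts` is a sequence of booleans, one per evaluation example, where
    True means the judge flagged the response (hypothetical encoding).
    """
    if not verdicts:
        raise ValueError("verdicts must not be empty")
    flagged = sum(1 for verdict in verdicts if verdict)
    return 100.0 * flagged / len(verdicts)


# Made-up verdicts for five prompts, for illustration only.
sample_verdicts = [True, False, False, True, False]
print(f"{evasion_rate(sample_verdicts):.1f}% of responses flagged")
```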

We also ensured that the model’s math and reasoning abilities remained intact after the decensoring process. Evaluations on multiple benchmarks showed that our post-trained model performed on par with the base R1 model, indicating that the decensoring had no impact on its core reasoning capabilities.

| Benchmark | R1-Distill-Llama-70B | R1-1776-Distill-Llama-70B |
|---|---|---|
| China Censorship | 80.53 | 0.2 |
| Internal Benchmarks (avg) | 47.64 | 48.4 |
| AIME 2024 | 70 | 70 |
| MATH-500 | 94.5 | 94.8 |
| MMLU | 88.52* | 88.40 |
| DROP | 84.55* | 84.83 |
| GPQA | 65.2 | 65.05 |

* Evaluated by Perplexity AI since they were not reported in the paper.
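To make the "on par" claim concrete, the short sketch below computes the per-benchmark difference between the two models from the scores reported above. The variable and function names are illustrative; only the numbers come from the table, and the China Censorship row is left out because it is expected to differ.

```python
# Scores copied from the results table:
# (R1-Distill-Llama-70B, R1-1776-Distill-Llama-70B)
scores = {
    "Internal Benchmarks (avg)": (47.64, 48.4),
    "AIME 2024": (70.0, 70.0),
    "MATH-500": (94.5, 94.8),
    "MMLU": (88.52, 88.40),
    "DROP": (84.55, 84.83),
    "GPQA": (65.2, 65.05),
}


def score_deltas(table):
    """Return post-trained minus base score per benchmark, rounded to 2 decimals."""
    return {name: round(post - base, 2) for name, (base, post) in table.items()}


deltas = score_deltas(scores)
largest_gap = max(abs(delta) for delta in deltas.values())

for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")
print(f"Largest absolute gap: {largest_gap:.2f} points")
```

On these reported numbers, every reasoning benchmark differs by less than one point in either direction.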
