8-bit quantization

#20
by ramkumarkoppu - opened

Hi @Unsloth team, thanks for the great work. Could you please provide instructions for quantizing to 8 bits locally on my Linux system, with the model weights downloaded from https://huggingface.co/deepseek-ai/DeepSeek-R1/tree/main, so that I can reproduce the quantized model files in the DeepSeek-R1-Q8_0 directory?

Unsloth AI org

The R1 model is already 8bit by default :)

Now I'm more confused; the model weights in the repo https://huggingface.co/deepseek-ai/DeepSeek-R1/tree/main tell me differently:

[screenshot: tensor dtypes shown in the repo's file listing]

The large matrices are FP8, specifically F8_E4M3.

So what did the @unsloth team do to create https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-Q8_0 from https://huggingface.co/deepseek-ai/DeepSeek-R1/tree/main? What are the reproduction steps?

Unsloth AI org

We had to convert it to bf16, then convert that to GGUF.
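A rough sketch of that two-step pipeline, assuming DeepSeek's FP8-to-BF16 casting helper from their GitHub repo and llama.cpp's converter (script names, paths, and flags are assumptions to verify against the current versions of both repos; this is not Unsloth's exact procedure):

```shell
# Step 1: dequantize the FP8 safetensors to BF16.
# DeepSeek publishes a casting script in its GitHub repo for this;
# the name and location below are assumed.
python fp8_cast_bf16.py \
    --input-fp8-hf-path DeepSeek-R1 \
    --output-bf16-hf-path DeepSeek-R1-BF16

# Step 2: convert the BF16 checkpoint to GGUF with llama.cpp,
# quantizing to Q8_0 in the same pass.
python convert_hf_to_gguf.py DeepSeek-R1-BF16 \
    --outtype q8_0 \
    --outfile DeepSeek-R1-Q8_0/DeepSeek-R1-Q8_0.gguf
```

Note that both steps need substantial disk space, since the BF16 intermediate is roughly twice the size of the FP8 checkpoint.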

Also see: https://unsloth.ai/blog/deepseekr1-dynamic
