8-bit quantization

#20
by ramkumarkoppu - opened

Hi @Unsloth team, thanks for the great work. Could you please provide instructions for quantizing to 8 bits locally on my Linux system, with the model weights downloaded from https://huggingface.co/deepseek-ai/DeepSeek-R1/tree/main, so that I can reproduce the quantized model files in the DeepSeek-R1-Q8_0 directory?

Unsloth AI org

The R1 model is already 8bit by default :)

Now I'm more confused; the model weights in the repo https://huggingface.co/deepseek-ai/DeepSeek-R1/tree/main tell me differently:

[screenshot: tensor dtypes shown in the repo's file listing]

The large matrices are FP8, specifically F8_E4M3.

So what did the @unsloth team do to create https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-Q8_0 from https://huggingface.co/deepseek-ai/DeepSeek-R1/tree/main? What are the reproduction steps?

Unsloth AI org

We had to convert it to bf16, then convert that to GGUF.
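A rough sketch of that two-step pipeline, assuming DeepSeek's FP8-to-BF16 casting helper from their GitHub repo and llama.cpp's converter (script names, paths, and flags are assumptions to verify against the current versions of both repos; this is not Unsloth's exact procedure):

```shell
# Step 1: dequantize the FP8 safetensors to BF16.
# DeepSeek publishes a casting script in its GitHub repo for this;
# the name and location below are assumed.
python fp8_cast_bf16.py \
    --input-fp8-hf-path DeepSeek-R1 \
    --output-bf16-hf-path DeepSeek-R1-BF16

# Step 2: convert the BF16 checkpoint to GGUF with llama.cpp,
# quantizing to Q8_0 in the same pass.
python convert_hf_to_gguf.py DeepSeek-R1-BF16 \
    --outtype q8_0 \
    --outfile DeepSeek-R1-Q8_0/DeepSeek-R1-Q8_0.gguf
```

Note that both steps need substantial disk space, since the BF16 intermediate is roughly twice the size of the FP8 checkpoint.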

Also see: https://unsloth.ai/blog/deepseekr1-dynamic
