DeepSeek-R1 Family
BitsAndBytes 4-bit quantization of DeepSeek-R1-Distill-Qwen-7B, built from commit 393119fcd6a873e5776c79b0db01c96911f5f0fc.

Tested successfully with vLLM 0.7.2 using the following parameters:
```python
import torch
from vllm import LLM

llm_model = LLM(
    "MPWARE/DeepSeek-R1-Distill-Qwen-7B-BnB-4bits",
    task="generate",
    dtype=torch.bfloat16,
    max_num_seqs=8192,
    max_model_len=8192,
    trust_remote_code=True,
    quantization="bitsandbytes",
    load_format="bitsandbytes",
    enforce_eager=True,  # required for the vLLM V1 engine architecture
    tensor_parallel_size=1,
    gpu_memory_utilization=0.95,
    seed=42,
)
```
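Once the model is loaded as above, text can be generated through vLLM's standard `generate` API. A minimal sketch follows; the prompt and the sampling values (`temperature=0.6`, `top_p=0.95`, `max_tokens=1024`) are illustrative assumptions, not part of this model card, and running it requires a CUDA GPU with the quantized weights downloaded.

```python
from vllm import SamplingParams

# Illustrative sampling settings (not prescribed by this card).
sampling = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=1024)

# `llm_model` is the LLM instance created in the snippet above.
outputs = llm_model.generate(
    ["Solve step by step: what is 12 * 7?"],
    sampling,
)
print(outputs[0].outputs[0].text)
```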
Base model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B