Skywork-SWE-32B GPTQ 4bit (vLLM-ready)
This is a 4-bit GPTQ quantized version of Skywork/Skywork-SWE-32B, compatible with vLLM.
- Quantization: GPTQ (4-bit)
- Group size: 128
- Format: GPTQModel (custom loader)
- Dtype: float16
Usage with vLLM
vllm serve ./skywork-swe-32b-gptqmodel-4bit \
--quantization gptq \
--dtype half \
--max-model-len 5900
Credits
- Base model: Skywork/Skywork-SWE-32B
- Quantized by: ramgpt