# openbuddy-r1-0528-distill-qwen3-32b-preview0-qat-gptq-4bit
Repository: ramgpt/openbuddy-r1-0528-distill-qwen3-32b-preview0-qat-gptq-4bit
This is a 4-bit GPTQ-quantized version of OpenBuddy/OpenBuddy-R1-0528-Distill-Qwen3-32B-Preview0-QAT, built for efficient inference with reduced memory and compute requirements.
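To make the memory claim concrete, here is a back-of-the-envelope estimate of the weight footprint at 4-bit versus fp16 for a ~32B-parameter model. This is an illustrative calculation only: it ignores activations, the KV cache, and the per-group scale/zero-point overhead that real GPTQ checkpoints carry.

```python
# Approximate weight memory for a ~32B-parameter model (illustrative only).
params = 32e9                        # ~32 billion parameters (assumption)
fp16_gb = params * 2 / 1024**3       # fp16: 2 bytes per weight
int4_gb = params * 0.5 / 1024**3     # int4: 4 bits = 0.5 bytes per weight
print(f"fp16: ~{fp16_gb:.0f} GiB, int4: ~{int4_gb:.0f} GiB")
```

Roughly a 4x reduction in weight memory, which is what brings a 32B model within reach of a single high-memory GPU.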
## Model Details
- Base model: OpenBuddy-R1-0528-Distill-Qwen3-32B-Preview0-QAT
- Quantization: GPTQ, 4-bit
- Format: GPTQ
- Precision: INT4
- Use case: Chatbot, general-purpose LLM tasks
- Target hardware: GPU inference with GPTQ-supported libraries (e.g., `exllama`, `gptq-for-llama`, `vLLM` with GPTQ)
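For intuition about what 4-bit quantization does to the weights, the sketch below implements plain round-to-nearest INT4 quantization with one scale per group. This is illustrative only: actual GPTQ additionally compensates rounding error column by column using second-order (Hessian) information, and stores packed codes rather than dequantized floats.

```python
def quantize_int4(weights, group_size=128):
    """Round-to-nearest symmetric 4-bit quantize/dequantize, one scale per group.

    Illustrative sketch only -- NOT the actual GPTQ algorithm, which also
    applies Hessian-based error compensation while quantizing.
    """
    out = []
    for i in range(0, len(weights), group_size):
        group = weights[i:i + group_size]
        # Map the largest magnitude in the group onto the int4 code +/-7.
        scale = max(abs(w) for w in group) / 7 or 1.0  # guard all-zero groups
        codes = [max(-8, min(7, round(w / scale))) for w in group]  # 4-bit codes
        out.extend(code * scale for code in codes)                  # dequantize
    return out

print(quantize_int4([0.7, -0.35, 0.1, 0.05], group_size=4))
```

Each weight is replaced by one of 16 representable values per group, which is why group size and scale storage matter for the accuracy/size trade-off.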
## How to Use
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ramgpt/openbuddy-r1-0528-distill-qwen3-32b-preview0-qat-gptq-4bit", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("ramgpt/openbuddy-r1-0528-distill-qwen3-32b-preview0-qat-gptq-4bit", device_map="auto", trust_remote_code=True)

# Quick generation check
inputs = tokenizer("Hello, who are you?", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=64)[0], skip_special_tokens=True))
```