
openbuddy-r1-0528-distill-qwen3-32b-preview0-qat-gptq-4bit

Repository: ramgpt/openbuddy-r1-0528-distill-qwen3-32b-preview0-qat-gptq-4bit

This is a 4-bit GPTQ-quantized version of OpenBuddy/OpenBuddy-R1-0528-Distill-Qwen3-32B-Preview0-QAT, built for efficient inference with reduced memory and compute requirements.

Model Details

  • Base model: OpenBuddy/OpenBuddy-R1-0528-Distill-Qwen3-32B-Preview0-QAT
  • Quantization: GPTQ, 4-bit
  • Format: GPTQ
  • Precision: INT4 (weights packed into I32 tensors)
  • Use case: chatbot and general-purpose LLM tasks
  • Target hardware: GPU inference with GPTQ-capable libraries (e.g., ExLlama, GPTQ-for-LLaMa, or vLLM with GPTQ enabled; see the vLLM sketch below)
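
For serving, vLLM can load GPTQ checkpoints directly. A minimal sketch follows (untested against this repo; the quantization flag and sampling values are illustrative assumptions):

from vllm import LLM, SamplingParams

# Assumes a vLLM build with GPTQ kernel support.
llm = LLM(
    model="ramgpt/openbuddy-r1-0528-distill-qwen3-32b-preview0-qat-gptq-4bit",
    quantization="gptq",  # select the GPTQ weight format explicitly
    trust_remote_code=True,
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Hello! Who are you?"], params)
print(outputs[0].outputs[0].text)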

How to Use

from transformers import AutoTokenizer, AutoModelForCausalLM

repo = "ramgpt/openbuddy-r1-0528-distill-qwen3-32b-preview0-qat-gptq-4bit"

# Loading the GPTQ checkpoint through transformers requires a GPTQ
# backend (for example, the optimum and auto-gptq packages) to be installed.
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto", trust_remote_code=True)
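
Once loaded, generation follows the usual transformers chat flow. A minimal usage sketch, assuming the tokenizer ships a chat template (the prompt and generation settings are illustrative):

# Build a chat prompt with the tokenizer's built-in template.
messages = [{"role": "user", "content": "Introduce yourself briefly."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate and decode only the newly produced tokens.
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))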

Checkpoint

  • Format: safetensors
  • Model size: 5.74B params
  • Tensor types: I32, BF16, F16
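
The I32 entries are the packed 4-bit GPTQ weights (typically eight 4-bit values per 32-bit integer); the BF16/F16 entries hold quantization scales and the layers left unquantized. A minimal sketch for checking the on-disk dtypes without downloading the weights, assuming huggingface_hub v0.19+ (which provides get_safetensors_metadata):

from huggingface_hub import get_safetensors_metadata

# Reads only the safetensors headers, not the weight data itself.
meta = get_safetensors_metadata(
    "ramgpt/openbuddy-r1-0528-distill-qwen3-32b-preview0-qat-gptq-4bit"
)
# Maps each on-disk dtype (e.g. "I32", "BF16", "F16") to its parameter count.
print(meta.parameter_count)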