GPT-OSS-20B Function Calling GGUF

This repository contains the GPT-OSS-20B model fine-tuned on function calling data, converted to GGUF format for efficient inference with llama.cpp and Ollama.

Model Details

  • Base Model: openai/gpt-oss-20b
  • Fine-tuning Dataset: Salesforce/xlam-function-calling-60k (2000 samples)
  • Fine-tuning Method: LoRA (r=8, alpha=16)
  • Context Length: 131,072 tokens
  • Model Size: 20.9B parameters

Files

  • gpt-oss-20b-function-calling-f16.gguf: F16 precision model (best quality)
  • gpt-oss-20b-function-calling.Q4_K_M.gguf: Q4_K_M quantized model (recommended for inference)

Usage

With Ollama (Recommended)

# Direct from Hugging Face
ollama run hf.co/cuijian0819/gpt-oss-20b-function-calling-gguf:Q4_K_M

# Or create local model
ollama create my-gpt-oss -f Modelfile
ollama run my-gpt-oss
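Once the model is running, it can also be queried programmatically through Ollama's local REST API (default port 11434). A minimal sketch that only builds the request payload; the model name `my-gpt-oss` assumes the `ollama create` step above was run:

```python
import json

# Build a payload for Ollama's /api/generate endpoint.
# POST it to http://localhost:11434/api/generate on a machine running Ollama.
def build_generate_request(prompt, model="my-gpt-oss"):
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # return one complete response instead of a token stream
    }

payload = build_generate_request("What is the weather in Boston today?")
print(json.dumps(payload, indent=2))
```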

With llama.cpp

# Download model
wget https://huggingface.co/cuijian0819/gpt-oss-20b-function-calling-gguf/resolve/main/gpt-oss-20b-function-calling.Q4_K_M.gguf

# Run inference
./llama-cli -m gpt-oss-20b-function-calling.Q4_K_M.gguf -p "Your prompt here"

Example Modelfile for Ollama

FROM ./gpt-oss-20b-function-calling.Q4_K_M.gguf

TEMPLATE """<|start|>user<|message|>{{ .Prompt }}<|end|>
<|start|>assistant<|channel|>final<|message|>"""

PARAMETER temperature 0.7
PARAMETER top_p 0.9

SYSTEM """You are a helpful AI assistant that can call functions to help users."""
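For reference, the TEMPLATE above wraps each user message in the model's chat tokens. A minimal sketch reproducing that expansion for a single-turn prompt (the function name is illustrative, not part of any API):

```python
# Reproduce the Modelfile TEMPLATE expansion for a single user prompt.
def render_prompt(user_message: str) -> str:
    return (
        f"<|start|>user<|message|>{user_message}<|end|>\n"
        "<|start|>assistant<|channel|>final<|message|>"
    )

print(render_prompt("Book a table for two at 7pm"))
```

This is the exact string Ollama substitutes for {{ .Prompt }}, leaving the assistant turn open so generation continues from the `final` channel marker.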

PyTorch Version

For training and fine-tuning with PyTorch/Transformers, check out the PyTorch version: cuijian0819/gpt-oss-20b-function-calling

Performance

The Q4_K_M quantized version offers a strong trade-off between size and quality:

  • Size Reduction: ~62% smaller than F16
  • Memory Requirements: ~16GB VRAM recommended
  • Quality: Minimal degradation from quantization

License

This model inherits the license from the base openai/gpt-oss-20b model.

Citation

@misc{gpt-oss-20b-function-calling-gguf,
  title={GPT-OSS-20B Function Calling GGUF},
  author={cuijian0819},
  year={2025},
  url={https://huggingface.co/cuijian0819/gpt-oss-20b-function-calling-gguf}
}