GPT-OSS-20B Function Calling GGUF

This repository contains the GPT-OSS-20B model fine-tuned on function calling data, converted to GGUF format for efficient inference with llama.cpp and Ollama.

Model Details

  • Base Model: openai/gpt-oss-20b
  • Fine-tuning Dataset: Salesforce/xlam-function-calling-60k (2000 samples)
  • Fine-tuning Method: LoRA (r=8, alpha=16)
  • Context Length: 131,072 tokens
  • Model Size: 20.9B parameters

Files

  • gpt-oss-20b-function-calling-f16.gguf: F16 precision model (best quality)
  • gpt-oss-20b-function-calling.Q4_K_M.gguf: Q4_K_M quantized model (recommended for inference)

Usage

With Ollama (Recommended)

# Direct from Hugging Face
ollama run hf.co/cuijian0819/gpt-oss-20b-function-calling-gguf:Q4_K_M

# Or create local model
ollama create my-gpt-oss -f Modelfile
ollama run my-gpt-oss
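Once the model is running, it can also be queried programmatically through Ollama's local REST API (default port 11434). A minimal sketch that only builds the request payload; the model name `my-gpt-oss` assumes the `ollama create` step above was run:

```python
import json

# Build a payload for Ollama's /api/generate endpoint.
# POST it to http://localhost:11434/api/generate on a machine running Ollama.
def build_generate_request(prompt, model="my-gpt-oss"):
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # return one complete response instead of a token stream
    }

payload = build_generate_request("What is the weather in Boston today?")
print(json.dumps(payload, indent=2))
```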

With llama.cpp

# Download model
wget https://huggingface.co/cuijian0819/gpt-oss-20b-function-calling-gguf/resolve/main/gpt-oss-20b-function-calling.Q4_K_M.gguf

# Run inference
./llama-cli -m gpt-oss-20b-function-calling.Q4_K_M.gguf -p "Your prompt here"

Example Modelfile for Ollama

FROM ./gpt-oss-20b-function-calling.Q4_K_M.gguf

TEMPLATE """<|start|>user<|message|>{{ .Prompt }}<|end|>
<|start|>assistant<|channel|>final<|message|>"""

PARAMETER temperature 0.7
PARAMETER top_p 0.9

SYSTEM """You are a helpful AI assistant that can call functions to help users."""
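For reference, the TEMPLATE above wraps each user message in the model's chat tokens. A minimal sketch reproducing that expansion for a single-turn prompt (the function name is illustrative, not part of any API):

```python
# Reproduce the Modelfile TEMPLATE expansion for a single user prompt.
def render_prompt(user_message: str) -> str:
    return (
        f"<|start|>user<|message|>{user_message}<|end|>\n"
        "<|start|>assistant<|channel|>final<|message|>"
    )

print(render_prompt("Book a table for two at 7pm"))
```

This is the exact string Ollama substitutes for {{ .Prompt }}, leaving the assistant turn open so generation continues from the `final` channel marker.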

PyTorch Version

For training and fine-tuning with PyTorch/Transformers, check out the PyTorch version: cuijian0819/gpt-oss-20b-function-calling

Performance

The Q4_K_M quantized version offers a strong trade-off between size and quality:

  • Size Reduction: ~62% smaller than F16
  • Memory Requirements: ~16GB VRAM recommended
  • Quality: Minimal degradation from quantization

License

This model inherits the license from the base openai/gpt-oss-20b model.

Citation

@misc{gpt-oss-20b-function-calling-gguf,
  title={GPT-OSS-20B Function Calling GGUF},
  author={cuijian0819},
  year={2025},
  url={https://huggingface.co/cuijian0819/gpt-oss-20b-function-calling-gguf}
}