---
tags:
  - gguf
  - quantized
  - gpt-oss
  - multilingual
  - text-generation
  - llama-cpp
  - ollama
language:
  - en
  - es
  - fr
  - de
  - it
  - pt
license: apache-2.0
model_type: gpt-oss
pipeline_tag: text-generation
base_model: openai/gpt-oss-20b
---

# GPT-OSS-20B Function Calling GGUF

This repository contains the GPT-OSS-20B model fine-tuned on function calling data, converted to GGUF format for efficient inference with llama.cpp and Ollama.

## Model Details

- **Base Model**: openai/gpt-oss-20b
- **Fine-tuning Dataset**: Salesforce/xlam-function-calling-60k (2,000 samples)
- **Fine-tuning Method**: LoRA (r=8, alpha=16)
- **Context Length**: 131,072 tokens
- **Model Size**: 20B parameters
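As a rough illustration of the training data, each sample in the xLAM function-calling dataset pairs a natural-language query with JSON tool definitions and the expected call(s). The exact field names below follow the Salesforce/xlam-function-calling-60k layout as an assumption; the tool itself is hypothetical.

```python
import json

# Sketch of one xLAM-style sample: query, available tools, expected answer(s).
sample = {
    "query": "What is the weather in Paris today?",
    "tools": [
        {
            "name": "get_weather",
            "description": "Get current weather for a city.",
            "parameters": {"city": {"type": "string", "description": "City name"}},
        }
    ],
    "answers": [
        {"name": "get_weather", "arguments": {"city": "Paris"}}
    ],
}

# Samples are stored as JSON, so they must round-trip cleanly.
serialized = json.dumps(sample)
restored = json.loads(serialized)
print(restored["answers"][0]["name"])
```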

## Files

- `gpt-oss-20b-function-calling-f16.gguf`: F16-precision model (best quality)
- `gpt-oss-20b-function-calling.Q4_K_M.gguf`: Q4_K_M-quantized model (recommended for inference)

## Usage

### With Ollama (Recommended)

```bash
# Pull directly from Hugging Face
ollama run hf.co/cuijian0819/gpt-oss-20b-function-calling-gguf:Q4_K_M

# Or create a local model from a Modelfile
ollama create my-gpt-oss -f Modelfile
ollama run my-gpt-oss
```
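Beyond the CLI, a running Ollama instance also serves a local REST API (by default at `http://localhost:11434`). A minimal sketch of the payload its `/api/generate` endpoint expects, using this repository's model tag; actually sending it requires the server to be up, so the request itself is shown only as a comment.

```python
import json

MODEL_TAG = "hf.co/cuijian0819/gpt-oss-20b-function-calling-gguf:Q4_K_M"

def build_generate_payload(prompt: str, stream: bool = False) -> dict:
    """Build the JSON body for Ollama's POST /api/generate endpoint."""
    return {"model": MODEL_TAG, "prompt": prompt, "stream": stream}

payload = build_generate_payload("List three French cheeses.")
print(json.dumps(payload, indent=2))

# Against a running Ollama server, the request would look like:
#   import urllib.request
#   req = urllib.request.Request(
#       "http://localhost:11434/api/generate",
#       data=json.dumps(payload).encode(),
#       headers={"Content-Type": "application/json"},
#   )
#   print(urllib.request.urlopen(req).read().decode())
```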

### With llama.cpp

```bash
# Download the quantized model
wget https://huggingface.co/cuijian0819/gpt-oss-20b-function-calling-gguf/resolve/main/gpt-oss-20b-function-calling.Q4_K_M.gguf

# Run inference
./llama-cli -m gpt-oss-20b-function-calling.Q4_K_M.gguf -p "Your prompt here"
```
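When prompted with tool definitions, a function-calling fine-tune typically emits a JSON object naming the tool and its arguments, possibly surrounded by prose. A minimal parser sketch for pulling that call out of raw model output; the exact output format of this model is an assumption, and the example response below is fabricated for illustration.

```python
import json
import re

def extract_tool_call(text: str):
    """Return the first JSON object with "name" and "arguments" keys
    found in the model output, or None if there is no well-formed call."""
    for match in re.finditer(r"\{.*\}", text, flags=re.DOTALL):
        try:
            obj = json.loads(match.group(0))
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict) and "name" in obj and "arguments" in obj:
            return obj
    return None

output = 'Sure, calling the tool: {"name": "get_weather", "arguments": {"city": "Paris"}}'
call = extract_tool_call(output)
print(call["name"], call["arguments"])
```

This is deliberately forgiving about surrounding text; a production harness would validate the arguments against the tool's declared parameter schema before dispatching.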

### Example Modelfile for Ollama

```
FROM ./gpt-oss-20b-function-calling.Q4_K_M.gguf

TEMPLATE """<|start|>user<|message|>{{ .Prompt }}<|end|>
<|start|>assistant<|channel|>final<|message|>"""

PARAMETER temperature 0.7
PARAMETER top_p 0.9

SYSTEM """You are a helpful AI assistant that can call functions to help users."""
```
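To see what the model actually receives, the `TEMPLATE` above can be reproduced as a small helper; the special-token names are copied verbatim from the Modelfile, and the single-turn layout is an assumption for illustration.

```python
def render_prompt(user_message: str) -> str:
    """Render a single-turn prompt the same way the Modelfile TEMPLATE does."""
    return (
        f"<|start|>user<|message|>{user_message}<|end|>\n"
        "<|start|>assistant<|channel|>final<|message|>"
    )

rendered = render_prompt("What time is it in Tokyo?")
print(rendered)
```

The trailing `<|start|>assistant<|channel|>final<|message|>` leaves the prompt open at the assistant's final channel, so generation continues from exactly that point.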

## PyTorch Version

For training and further fine-tuning with PyTorch/Transformers, see the PyTorch version: [cuijian0819/gpt-oss-20b-function-calling](https://huggingface.co/cuijian0819/gpt-oss-20b-function-calling)

## Performance

The Q4_K_M quantization offers a strong size/quality trade-off:

- **Size Reduction**: ~62% smaller than F16
- **Memory Requirements**: ~16 GB VRAM recommended
- **Quality**: Minimal degradation from quantization
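A back-of-envelope check of the ~62% figure: F16 stores 2 bytes per parameter, so 20B parameters come to roughly 40 GB, and a 62% reduction lands near 15 GB. Actual GGUF file sizes differ somewhat because of metadata and per-layer quantization choices.

```python
params = 20e9
f16_bytes = params * 2             # F16: 2 bytes per parameter
q4_bytes = f16_bytes * (1 - 0.62)  # ~62% smaller, per the figure above
print(f"F16:    ~{f16_bytes / 1e9:.0f} GB")
print(f"Q4_K_M: ~{q4_bytes / 1e9:.1f} GB")
```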

## License

This model inherits the license from the base openai/gpt-oss-20b model.

## Citation

```bibtex
@misc{gpt-oss-20b-function-calling-gguf,
  title={GPT-OSS-20B Function Calling GGUF},
  author={cuijian0819},
  year={2025},
  url={https://huggingface.co/cuijian0819/gpt-oss-20b-function-calling-gguf}
}
```