---
tags:
- gguf
- quantized
- gpt-oss
- multilingual
- text-generation
- llama-cpp
- ollama
language:
- en
- es
- fr
- de
- it
- pt
license: apache-2.0
model_type: gpt-oss
pipeline_tag: text-generation
base_model: openai/gpt-oss-20b
---

# GPT-OSS-20B Function Calling GGUF

This repository contains the GPT-OSS-20B model fine-tuned on function-calling data and converted to GGUF format for efficient inference with llama.cpp and Ollama.

## Model Details

- **Base Model:** openai/gpt-oss-20b
- **Fine-tuning Dataset:** Salesforce/xlam-function-calling-60k (2,000-sample subset)
- **Fine-tuning Method:** LoRA (r=8, alpha=16)
- **Context Length:** 131,072 tokens
- **Model Size:** 20B parameters

## Files

- `gpt-oss-20b-function-calling-f16.gguf`: F16-precision model (best quality)
- `gpt-oss-20b-function-calling.Q4_K_M.gguf`: Q4_K_M quantized model (recommended for inference)

## Usage

### With Ollama (Recommended)

```bash
# Run directly from Hugging Face
ollama run hf.co/cuijian0819/gpt-oss-20b-function-calling-gguf:Q4_K_M

# Or create a local model from the Modelfile below
ollama create my-gpt-oss -f Modelfile
ollama run my-gpt-oss
```
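Once the model is running you can also query it over Ollama's REST API (default port 11434). This is a minimal sketch: the inline JSON tool description in the prompt is illustrative only, since the exact function-calling prompt format this fine-tune expects (derived from xlam-function-calling-60k) is not documented here and is an assumption.

```bash
# Sketch: function-calling style prompt against Ollama's /api/generate.
# The tool schema embedded in the prompt is a hypothetical example.
curl http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "hf.co/cuijian0819/gpt-oss-20b-function-calling-gguf:Q4_K_M",
    "prompt": "You can call this tool: {\"name\": \"get_weather\", \"parameters\": {\"city\": \"string\"}}. What is the weather in Paris?",
    "stream": false
  }'
```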
### With llama.cpp

```bash
# Download the quantized model
wget https://huggingface.co/cuijian0819/gpt-oss-20b-function-calling-gguf/resolve/main/gpt-oss-20b-function-calling.Q4_K_M.gguf

# Run inference
./llama-cli -m gpt-oss-20b-function-calling.Q4_K_M.gguf -p "Your prompt here"
```
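llama.cpp also ships `llama-server`, which exposes an OpenAI-compatible HTTP endpoint. A minimal sketch assuming default settings (port 8080); adjust the context size (`-c`) to your hardware.

```bash
# Serve the model over an OpenAI-compatible API
# (llama-server listens on http://localhost:8080 by default)
./llama-server -m gpt-oss-20b-function-calling.Q4_K_M.gguf -c 8192

# From another shell, query the chat completions endpoint
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Name three GGUF quantization types."}]}'
```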
### Example Modelfile for Ollama

```dockerfile
FROM ./gpt-oss-20b-function-calling.Q4_K_M.gguf

TEMPLATE """<|start|>user<|message|>{{ .Prompt }}<|end|>
<|start|>assistant<|channel|>final<|message|>"""

PARAMETER temperature 0.7
PARAMETER top_p 0.9

SYSTEM """You are a helpful AI assistant that can call functions to help users."""
```
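After `ollama create`, you can confirm the template was applied and smoke-test the model in one shot. As above, the tool schema in the prompt is a hypothetical example, not the documented fine-tuning format.

```bash
# Print the effective Modelfile to confirm TEMPLATE/PARAMETER values
ollama show my-gpt-oss --modelfile

# One-shot prompt with an illustrative (hypothetical) tool description
ollama run my-gpt-oss 'You can call {"name": "get_time", "parameters": {"timezone": "string"}}. What time is it in Tokyo?'
```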
## PyTorch Version

For training and further fine-tuning with PyTorch/Transformers, see the PyTorch version: [cuijian0819/gpt-oss-20b-function-calling](https://huggingface.co/cuijian0819/gpt-oss-20b-function-calling)

## Performance

The Q4_K_M quantized version offers a strong size/quality trade-off:

- **Size Reduction:** ~62% smaller than F16
- **Memory Requirements:** ~16 GB VRAM recommended
- **Quality:** minimal degradation relative to F16

As a rough sanity check: 20B parameters at 2 bytes each is about 40 GB in F16, and a ~62% reduction brings the weights to roughly 15 GB, consistent with the ~16 GB VRAM recommendation.

## License

This model inherits the Apache-2.0 license from the base openai/gpt-oss-20b model.

## Citation

```bibtex
@misc{gpt-oss-20b-function-calling-gguf,
  title={GPT-OSS-20B Function Calling GGUF},
  author={cuijian0819},
  year={2025},
  url={https://huggingface.co/cuijian0819/gpt-oss-20b-function-calling-gguf}
}
```