danielhanchen committed · verified · Commit cf16ffd · Parent(s): 1064e8e

Update README.md

Files changed (1): README.md (+38 −4)
README.md CHANGED
@@ -1,10 +1,44 @@
 ---
+base_model:
+- openai/gpt-oss-20b
 license: apache-2.0
 pipeline_tag: text-generation
 library_name: transformers
 tags:
-- vllm
+- openai
+- unsloth
 ---
+<div>
+<p style="margin-bottom: 0; margin-top: 0;">
+<strong>See <a href="https://huggingface.co/collections/unsloth/gpt-oss-6892433695ce0dee42f31681">our collection</a> for all versions of gpt-oss including GGUF, 4-bit & 16-bit formats.</strong>
+</p>
+<p style="margin-bottom: 0;">
+<em>Learn to run gpt-oss correctly - <a href="https://docs.unsloth.ai/basics/gpt-oss">Read our Guide</a>.</em>
+</p>
+<p style="margin-top: 0;margin-bottom: 0;">
+<em>See <a href="https://docs.unsloth.ai/basics/unsloth-dynamic-v2.0-gguf">Unsloth Dynamic 2.0 GGUFs</a> for our quantization benchmarks.</em>
+</p>
+<div style="display: flex; gap: 5px; align-items: center; ">
+<a href="https://github.com/unslothai/unsloth/">
+<img src="https://github.com/unslothai/unsloth/raw/main/images/unsloth%20new%20logo.png" width="133">
+</a>
+<a href="https://discord.gg/unsloth">
+<img src="https://github.com/unslothai/unsloth/raw/main/images/Discord%20button.png" width="173">
+</a>
+<a href="https://docs.unsloth.ai/basics/gpt-oss">
+<img src="https://raw.githubusercontent.com/unslothai/unsloth/refs/heads/main/images/documentation%20green%20button.png" width="143">
+</a>
+</div>
+<h1 style="margin-top: 0rem;">✨ Read our gpt-oss Guide <a href="https://docs.unsloth.ai/basics/gpt-oss">here</a>!</h1>
+</div>
+
+- Read our Blog about gpt-oss support: [unsloth.ai/blog/gpt-oss](https://unsloth.ai/blog/gpt-oss)
+- View the rest of our notebooks in our [docs here](https://docs.unsloth.ai/get-started/unsloth-notebooks).
+- Thank you to the [llama.cpp](https://github.com/ggml-org/llama.cpp) team for their work on supporting this model. We wouldn't be able to release quants without them!
+
+The F32 quant is MXFP4 upcasted to BF16 for every single layer and is unquantized.
+
+# gpt-oss-20b Details
 
 <p align="center">
 <img alt="gpt-oss-20b" src="https://raw.githubusercontent.com/openai/gpt-oss/main/docs/gpt-oss-20b.svg">
@@ -13,7 +47,7 @@ tags:
 <p align="center">
 <a href="https://gpt-oss.com"><strong>Try gpt-oss</strong></a> ·
 <a href="https://cookbook.openai.com/topic/gpt-oss"><strong>Guides</strong></a> ·
-<a href="https://openai.com/index/gpt-oss-model-card"><strong>Model card</strong></a> ·
+<a href="https://openai.com/index/gpt-oss-model-card"><strong>System card</strong></a> ·
 <a href="https://openai.com/index/introducing-gpt-oss/"><strong>OpenAI blog</strong></a>
 </p>
 
@@ -21,8 +55,8 @@ tags:
 
 Welcome to the gpt-oss series, [OpenAI’s open-weight models](https://openai.com/open-models) designed for powerful reasoning, agentic tasks, and versatile developer use cases.
 
-We’re releasing two flavors of these open models:
-- `gpt-oss-120b` — for production, general purpose, high reasoning use cases that fit into a single H100 GPU (117B parameters with 5.1B active parameters)
+We’re releasing two flavors of the open models:
+- `gpt-oss-120b` — for production, general purpose, high reasoning use cases that fits into a single H100 GPU (117B parameters with 5.1B active parameters)
 - `gpt-oss-20b` — for lower latency, and local or specialized use cases (21B parameters with 3.6B active parameters)
 
 Both models were trained on our [harmony response format](https://github.com/openai/harmony) and should only be used with the harmony format as it will not work correctly otherwise.