---
license: mit
datasets:
  - fka/awesome-chatgpt-prompts
language:
  - en
metrics:
  - accuracy
base_model:
  - Qwen/Qwen-Image
new_version: openai/gpt-oss-120b
pipeline_tag: text-generation
library_name: transformers
tags:
  - code
---

# Using a Trained Mini-GPT Model (Safetensors)

This guide explains how to load a trained Mini-GPT model saved in safetensors format and generate text with it. It is written step by step for learning and understanding.


## 1️⃣ Install Required Packages

Make sure you have the necessary packages:

```bash
pip install torch transformers safetensors
```
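A quick sanity check that the packages import cleanly and whether a GPU is visible (a minimal sketch; these version attributes are standard for both libraries):

```python
import torch
import transformers
import safetensors  # just to confirm the package resolves

# Confirm installed versions and CUDA availability
print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())
```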

## 2️⃣ Load the Trained Model and Tokenizer

We saved our model earlier in `./mini_gpt_safetensor`. Here’s how to load it:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "./mini_gpt_safetensor"  # path to your saved model

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_path)
tokenizer.pad_token = tokenizer.eos_token  # GPT-style models have no pad token, so reuse EOS

# Load model; device_map="auto" places it on GPU if one is available
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
```

Note: with `device_map="auto"` the model is loaded onto the GPU if one is available, otherwise onto the CPU.
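If you have not saved a model yet, here is a minimal sketch of the saving step this guide assumes (recent transformers versions write safetensors by default; `safe_serialization=True` makes it explicit):

```python
# Assumes `model` and `tokenizer` are the objects you just finished training
model.save_pretrained("./mini_gpt_safetensor", safe_serialization=True)
tokenizer.save_pretrained("./mini_gpt_safetensor")
```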

## 3️⃣ Generate Text from a Prompt

Once the model is loaded, we can generate text using a simple function:

```python
def generate_text(prompt, max_length=50):
    # Tokenize the prompt and move it to the model's device
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Generate text
    output_ids = model.generate(
        **inputs,
        max_length=max_length,
        do_sample=True,   # enable randomness
        top_k=50,         # sample only from the 50 most likely tokens
        top_p=0.95,       # nucleus sampling
        temperature=0.7,  # creativity factor
        num_return_sequences=1,
        pad_token_id=tokenizer.eos_token_id,  # silences the missing-pad-token warning
    )

    # Decode the output tokens back into a readable string
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```
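Sampling makes each run different; if you want repeatable output while debugging, seed the random generators first (a minimal sketch using transformers' `set_seed` helper):

```python
from transformers import set_seed

set_seed(42)  # fixes Python, NumPy, and PyTorch RNG state
print(generate_text("Hello, I am training a mini GPT model"))
```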

Tip:

- `do_sample=True` → random outputs for creativity
- `top_k` and `top_p` → restrict which tokens can be sampled from
- `temperature` → higher values give more creative (and less predictable) output

The sketch after this list shows how these parameters change the output.
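To see the effect directly, here is a small sketch that generates from the same prompt at several temperatures (reusing `model` and `tokenizer` from above; the prompt is just an example):

```python
prompt = "Deep learning is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

for temp in (0.3, 0.7, 1.2):
    out = model.generate(
        **inputs,
        max_length=40,
        do_sample=True,
        temperature=temp,
        pad_token_id=tokenizer.eos_token_id,
    )
    print(f"temperature={temp}: {tokenizer.decode(out[0], skip_special_tokens=True)}")
```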

## 4️⃣ Test Text Generation

Use your function with any prompt:

```python
prompt = "Hello, I am training a mini GPT model"
generated_text = generate_text(prompt, max_length=50)

print("\n📝 Generated text:")
print(generated_text)
```

Example output (yours will differ, since sampling is random):

```
Hello, I am training a mini GPT model to generate simple sentences about Python, deep learning, and AI projects.
```

## ✅ Summary

- Load the tokenizer and model from the safetensors folder.
- Call `generate` with suitable sampling parameters for creative text.
- Decode the output IDs to get readable text.
- Experiment with `prompt`, `max_length`, `top_k`, `top_p`, and `temperature` to control generation; a small interactive loop for doing so follows below.
- By following this guide, you can load any trained Mini-GPT model and generate text interactively.
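For interactive use, here is a minimal sketch of a prompt loop built on the `generate_text` function above (press Enter on an empty line to quit):

```python
while True:
    prompt = input("\nPrompt (empty to quit): ")
    if not prompt:
        break
    print(generate_text(prompt, max_length=80))
```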