pikoGPT-16M

The first and smallest GPT-2-style model in this series, trained for fun/education.

Not intended to be a proper model; just a pathfinder project for me to learn from.

Training

Trained on a single RTX 3090 for ~20k steps with the train_gpt2.py script from Karpathy's llm.c repo. The dataset is edu_fineweb10B, prepared with the data scripts from the same repo.
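
For quick experimentation, here is a minimal inference sketch. It assumes the checkpoint has been exported in a Hugging Face-compatible GPT-2 format under the repo id pagarsky/pikoGPT-16M and that the standard GPT-2 tokenizer is used (both are assumptions, not guaranteed by this card).

```python
# Minimal inference sketch, assuming the checkpoint is available in a
# Hugging Face-compatible GPT-2 format under the repo id below (assumption).
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

repo_id = "pagarsky/pikoGPT-16M"  # assumed repo id
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")  # llm.c uses the standard GPT-2 tokenizer
model = GPT2LMHeadModel.from_pretrained(repo_id)
model.eval()

prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    top_k=50,
    temperature=0.8,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Expect very limited output quality: at ~16M parameters and ~20k training steps, the model is a learning exercise rather than a usable language model.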
