Pythia Deduped Series GGML

This repository contains quantized conversions of EleutherAI's Pythia Deduped checkpoints.

For use with frontends that support GGML-quantized GPT-NeoX models, such as KoboldCpp and Oobabooga's text-generation-webui (with the CTransformers loader).
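As a minimal sketch of the CTransformers route, assuming the `ctransformers` package is installed and one of the files from the table below has been downloaded locally (the path and generation call are illustrative, not part of this repo):

```python
def load_pythia(path="ggmlv3-pythia-410m-deduped-q4_0.bin"):
    # Deferred import so this sketch only requires the optional
    # ctransformers package when it is actually called.
    from ctransformers import AutoModelForCausalLM

    # Pythia uses the GPT-NeoX architecture, so CTransformers must be
    # told model_type="gpt_neox" to select the right GGML loader.
    return AutoModelForCausalLM.from_pretrained(path, model_type="gpt_neox")


# Usage (illustrative):
#   llm = load_pythia()
#   print(llm("EleutherAI's Pythia suite is", max_new_tokens=32))
```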

Last updated on 2023-05-25.

For other versions of the models, see here:

Description:

  • The motivation behind these quantizations is that the LLaMA series lacks sizes below 7B, whereas older model families were commonly released at sizes as small as ~125M parameters. Even with 2-bit quantization, a 7B model is uncomfortable to run on hardware with less than 4 GB of RAM; these smaller Pythia checkpoints fill that gap.

RAM USAGE

| Model | RAM usage |
| --- | --- |
| Unloaded | 41.3 MiB |
| ggmlv3-pythia-70m-deduped-q4_0.bin | 95.5 MiB |
| ggmlv3-pythia-160m-deduped-q4_0.bin | 201.1 MiB |
| ggmlv3-pythia-410m-deduped-q4_0.bin | 415.1 MiB |
| ggmlv3-pythia-1b-deduped-q4_0.bin | 762.2 MiB |
| ggmlv3-pythia-1.4b-deduped-q4_0.bin | 1.0 GiB |
| ggmlv3-pythia-2.8b-deduped-q4_0.bin | 1.9 GiB |
| ggmlv3-pythia-70m-deduped-q5_1.bin | 108.7 MiB |
| ggmlv3-pythia-160m-deduped-q5_1.bin | 226.9 MiB |
| ggmlv3-pythia-410m-deduped-q5_1.bin | 494.0 MiB |
| ggmlv3-pythia-1b-deduped-q5_1.bin | 943.9 MiB |
| ggmlv3-pythia-1.4b-deduped-q5_1.bin | 1.3 GiB |
| ggmlv3-pythia-2.8b-deduped-q5_1.bin | 2.3 GiB |

Tested on KoboldCpp with OpenBLAS enabled.
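As a rough sanity check on the table above: GGML's q4_0 format stores weights in blocks of 32 (sixteen bytes of 4-bit values plus a 2-byte fp16 scale, i.e. 18 bytes per block or 4.5 bits per weight), while q5_1 uses 24 bytes per block (6 bits per weight). A minimal sketch of the resulting weight-storage estimate follows; actual RAM usage is higher than this because of higher-precision embedding/output layers, the KV cache, and scratch buffers:

```python
def quantized_weight_bytes(n_params, bytes_per_block, weights_per_block=32):
    """Estimate bytes needed to store n_params weights in a GGML block format."""
    return n_params * bytes_per_block / weights_per_block


# Block sizes: q4_0 = 18 bytes / 32 weights (4.5 bits per weight),
#              q5_1 = 24 bytes / 32 weights (6.0 bits per weight).
for n_params, name in [(70e6, "70m"), (410e6, "410m"), (1.4e9, "1.4b")]:
    q4 = quantized_weight_bytes(n_params, 18) / 2**20  # MiB
    q5 = quantized_weight_bytes(n_params, 24) / 2**20
    print(f"pythia-{name}-deduped: q4_0 ~ {q4:.0f} MiB, q5_1 ~ {q5:.0f} MiB")
```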
