
This repo contains serialized blobs of the up-projection layer of Llama-3-8B (oc = 14336, ic = 4096). The linear layer has been quantized with GPTQ (W4 symmetric, group size 32) and sparsified to 50%.

├── sparse_w4
│   ├── linear_bitmap_int32.bin
│   ├── linear_compressed_qweight_int32.bin
│   ├── linear_nnz_int16.bin
│   ├── linear_scales_float16.bin
│   └── linear_zeros_int32.bin
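
Each filename encodes the element dtype of the raw blob. As a minimal loading sketch with NumPy, the blobs can be read back as flat arrays; the layout-specific reshaping is left to unpack_blobs.py:

```python
import numpy as np

# Each blob is a raw dump; the filename suffix gives the element dtype.
bitmap  = np.fromfile("sparse_w4/linear_bitmap_int32.bin", dtype=np.int32)
qweight = np.fromfile("sparse_w4/linear_compressed_qweight_int32.bin", dtype=np.int32)
nnz     = np.fromfile("sparse_w4/linear_nnz_int16.bin", dtype=np.int16)
scales  = np.fromfile("sparse_w4/linear_scales_float16.bin", dtype=np.float16)
zeros   = np.fromfile("sparse_w4/linear_zeros_int32.bin", dtype=np.int32)
```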

Usage

The following script shows how to process the blobs in Python: it demonstrates unpacking, zero-location recovery, and the weight dequantization process.

python unpack_blobs.py

You can ignore internal/.
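
For orientation, below is a minimal sketch of the recovery steps the script walks through: expanding the sparsity bitmap, unpacking 4-bit codes from the int32 words, and group-wise dequantization. The nibble order, group layout, and zero-point handling here are illustrative assumptions; unpack_blobs.py defines the actual format.

```python
import numpy as np

GROUP_SIZE = 32  # quantization group size (W4, g=32)

def unpack_int4(packed: np.ndarray) -> np.ndarray:
    """Unpack 8 unsigned 4-bit codes from every int32 word.

    Assumes low-nibble-first packing; the real order is defined in unpack_blobs.py.
    """
    shifts = np.arange(8, dtype=np.uint32) * 4
    words = packed.astype(np.uint32)[:, None]
    return ((words >> shifts) & 0xF).reshape(-1).astype(np.int32)

def expand_bitmap(bitmap: np.ndarray) -> np.ndarray:
    """Expand the int32 sparsity bitmap into a flat boolean mask, one bit per weight position."""
    return np.unpackbits(bitmap.view(np.uint8), bitorder="little").astype(bool)

def dequantize(codes: np.ndarray, scales: np.ndarray, zeros: np.ndarray) -> np.ndarray:
    """Group-wise dequantization: w = scale * (q - zero), one (scale, zero) pair per group of 32.

    `codes` is a dense integer weight matrix (zeros scattered back in using the bitmap);
    `zeros` is assumed to be already unpacked to one integer per group.
    """
    q = codes.reshape(-1, GROUP_SIZE).astype(np.float32)
    s = scales.astype(np.float32).reshape(-1, 1)
    z = zeros.astype(np.float32).reshape(-1, 1)
    return ((q - z) * s).reshape(codes.shape)
```

The expanded bitmap marks which of the oc x ic positions hold nonzero weights, so the codes recovered from the compressed qweight stream can be scattered back into a dense matrix before dequantization; nnz presumably records how many nonzeros each row (or block) contributes to the compressed stream.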
