---
license: mit
---
This is an INT4 Llama-3-8b model quantized with per-group QQQ, using a group size of 128. QQQ is an innovative, hardware-optimized W4A8 quantization solution: weights are quantized to 4 bits and activations to 8 bits. For more details, please refer to our code [repo](https://github.com/HandH1998/QQQ) and our [paper](https://arxiv.org/pdf/2406.09904).
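
As a rough illustration of what "per-group INT4 with group size 128" means (this is a minimal NumPy sketch of generic symmetric per-group quantization, not QQQ's actual algorithm, which also quantizes activations and applies further optimizations described in the paper):

```python
import numpy as np

def quantize_per_group_int4(w, group_size=128):
    """Symmetric per-group INT4 quantization: each run of `group_size`
    weights shares one scale mapping values into the INT4 range [-8, 7]."""
    groups = w.reshape(-1, group_size)
    scales = np.abs(groups).max(axis=1, keepdims=True) / 7.0
    scales = np.where(scales == 0, 1.0, scales)  # avoid division by zero
    q = np.clip(np.round(groups / scales), -8, 7).astype(np.int8)
    return q.reshape(w.shape), scales

def dequantize_per_group(q, scales, group_size=128):
    groups = q.reshape(-1, group_size).astype(np.float32)
    return (groups * scales).reshape(q.shape)

# Quantize a random weight matrix and check the reconstruction error.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)
q, s = quantize_per_group_int4(w)
w_hat = dequantize_per_group(q, s)
err = np.abs(w - w_hat).max()
```

A smaller group size gives each scale fewer weights to cover, lowering quantization error at the cost of storing more scales; 128 is a common trade-off.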