---
license: mit
---
This is an INT4 Llama-3-8b model quantized with per-group QQQ, using a group size of 128. QQQ is an innovative, hardware-optimized W4A8 quantization solution: weights are quantized to 4 bits and activations to 8 bits. For more details, please refer to our code [repo](https://github.com/HandH1998/QQQ) and our [paper](https://arxiv.org/pdf/2406.09904).
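
As a rough illustration of what "per-group INT4 with group size 128" means (this is a minimal NumPy sketch of generic symmetric per-group quantization, not QQQ's actual algorithm, which also quantizes activations and applies further optimizations described in the paper):

```python
import numpy as np

def quantize_per_group_int4(w, group_size=128):
    """Symmetric per-group INT4 quantization: each run of `group_size`
    weights shares one scale mapping values into the INT4 range [-8, 7]."""
    groups = w.reshape(-1, group_size)
    scales = np.abs(groups).max(axis=1, keepdims=True) / 7.0
    scales = np.where(scales == 0, 1.0, scales)  # avoid division by zero
    q = np.clip(np.round(groups / scales), -8, 7).astype(np.int8)
    return q.reshape(w.shape), scales

def dequantize_per_group(q, scales, group_size=128):
    groups = q.reshape(-1, group_size).astype(np.float32)
    return (groups * scales).reshape(q.shape)

# Quantize a random weight matrix and check the reconstruction error.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)
q, s = quantize_per_group_int4(w)
w_hat = dequantize_per_group(q, s)
err = np.abs(w - w_hat).max()
```

A smaller group size gives each scale fewer weights to cover, lowering quantization error at the cost of storing more scales; 128 is a common trade-off.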