base_model: Qwen/Qwen3-4B

[EXL3](https://github.com/turboderp-org/exllamav3) quantization of [Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B), 4 bits per weight.

### HumanEval (argmax)
| Model                                                                        | Q4    | Q6    | Q8    | FP16  |
| ---------------------------------------------------------------------------- | ----- | ----- | ----- | ----- |
| [Qwen3-4B-exl3-4bpw](https://huggingface.co/isogen/Qwen3-4B-exl3-4bpw)       | 80.5% | 81.1% | 81.7% | 80.5% |
| [Qwen3-4B-exl3-6bpw](https://huggingface.co/isogen/Qwen3-4B-exl3-6bpw)       | 80.5% | 85.4% | 86.0% | 86.0% |
| [Qwen3-4B-exl3-8bpw-h8](https://huggingface.co/isogen/Qwen3-4B-exl3-8bpw-h8) | 82.3% | 84.8% | 83.5% | 82.9% |
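
To put the differences between cells in perspective, the sketch below converts a reported pass rate back into a count of solved problems. It assumes the scores are pass@1 over the standard 164-problem HumanEval set, which this card does not state explicitly; the helper names `solved_count` and `pass_rate` are illustrative and not part of any library.

```python
# Assumption: scores are pass@1 over the standard 164-problem HumanEval set.
HUMANEVAL_PROBLEMS = 164


def solved_count(pass_rate_percent: float, total: int = HUMANEVAL_PROBLEMS) -> int:
    """Return the whole number of solved problems closest to a reported pass rate."""
    return round(pass_rate_percent / 100 * total)


def pass_rate(solved: int, total: int = HUMANEVAL_PROBLEMS) -> float:
    """Return the pass rate in percent, rounded to one decimal place."""
    return round(100 * solved / total, 1)


if __name__ == "__main__":
    # A few values taken from the table above.
    for reported in (80.5, 81.1, 81.7, 86.0):
        n = solved_count(reported)
        print(f"{reported}% is about {n}/{HUMANEVAL_PROBLEMS} "
              f"(recomputed: {pass_rate(n)}%)")
```

Under that assumption, one problem is worth roughly 0.6 percentage points, so neighbouring cells such as 80.5% and 81.1% differ by a single solved problem.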