
Qwen3-4B-exl3-4bpw


---
base_model: Qwen/Qwen3-4B
---

[EXL3](https://github.com/turboderp-org/exllamav3) quantization of [Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B), 4 bits per weight.
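For a rough sense of what 4 bits per weight means for storage, the arithmetic can be sketched as below. This is only a back-of-the-envelope estimate: the parameter count of 4e9 is a nominal figure assumed from the model name, and the helper `approx_weight_gib` is hypothetical. It ignores file overhead and any tensors stored at a different bitrate, so the actual download size will differ.

```python
def approx_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Estimate weight storage in GiB for a given average bitrate."""
    total_bytes = n_params * bits_per_weight / 8  # 8 bits per byte
    return total_bytes / 2**30                    # bytes -> GiB


# Assumption: roughly 4e9 parameters, taken from the "4B" in the model name.
NOMINAL_PARAMS = 4e9

for bpw in (4, 6, 8, 16):
    print(f"{bpw:>2} bpw: ~{approx_weight_gib(NOMINAL_PARAMS, bpw):.2f} GiB")
```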

### HumanEval (argmax)

| Model | Q4 | Q6 | Q8 | FP16 |
| ---------------------------------------------------------------------------- | ----- | ----- | ----- | ----- |
| [Qwen3-4B-exl3-4bpw](https://huggingface.co/isogen/Qwen3-4B-exl3-4bpw) | 80.5% | 81.1% | 81.7% | 80.5% |
| [Qwen3-4B-exl3-6bpw](https://huggingface.co/isogen/Qwen3-4B-exl3-6bpw) | 80.5% | 85.4% | 86.0% | 86.0% |
| [Qwen3-4B-exl3-8bpw-h8](https://huggingface.co/isogen/Qwen3-4B-exl3-8bpw-h8) | 82.3% | 84.8% | 83.5% | 82.9% |
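
To put the percentages in perspective, they can be converted back into solved-problem counts. The sketch below assumes the scores were measured on the standard 164-problem HumanEval set with one greedy (argmax) sample per problem; the card does not state this explicitly, and the helper `solved_count` is hypothetical. Under that assumption, every value in the table corresponds to a whole number of problems.

```python
HUMANEVAL_PROBLEMS = 164  # assumption: the full standard HumanEval set was used

# Pass rates (percent) copied from the table in this card.
scores = {
    "Qwen3-4B-exl3-4bpw": {"Q4": 80.5, "Q6": 81.1, "Q8": 81.7, "FP16": 80.5},
    "Qwen3-4B-exl3-6bpw": {"Q4": 80.5, "Q6": 85.4, "Q8": 86.0, "FP16": 86.0},
    "Qwen3-4B-exl3-8bpw-h8": {"Q4": 82.3, "Q6": 84.8, "Q8": 83.5, "FP16": 82.9},
}


def solved_count(percent: float, total: int = HUMANEVAL_PROBLEMS) -> int:
    """Convert a rounded pass rate back to the nearest number of solved problems."""
    return round(percent / 100 * total)


counts = {
    model: {column: solved_count(p) for column, p in row.items()}
    for model, row in scores.items()
}

for model, row in counts.items():
    print(model, row)
```

With 164 problems, one problem is worth about 0.6 percentage points, so gaps of a point or two between columns amount to only a handful of problems.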