isogen/Qwen3-0.6B-exl3-8bpw-h8

---
base_model: Qwen/Qwen3-0.6B
---

[EXL3](https://github.com/turboderp-org/exllamav3) quantization of [Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B), 8 bits per weight, including output layers.
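
As a rough illustration of what "8 bits per weight" implies for storage, the sketch below compares the approximate weight footprint at 8 bpw against FP16. The helper `approx_weight_bytes` and the constant `NOMINAL_PARAMS` are hypothetical names introduced here. The parameter count is an assumption taken from the "0.6B" in the model name, not a measured value, and the estimate ignores file overhead and any tensors stored at a different precision.

```python
def approx_weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate storage in bytes for n_params weights at a given bit width."""
    return n_params * bits_per_weight / 8


# Assumption: nominal parameter count implied by the "0.6B" in the model name.
NOMINAL_PARAMS = 0.6e9

exl3_8bpw = approx_weight_bytes(NOMINAL_PARAMS, 8)   # this quantization
fp16 = approx_weight_bytes(NOMINAL_PARAMS, 16)       # unquantized reference

print(f"8 bpw: ~{exl3_8bpw / 1e9:.2f} GB")
print(f"FP16:  ~{fp16 / 1e9:.2f} GB")
print(f"ratio: {exl3_8bpw / fp16:.2f}")
```

Under these assumptions the 8 bpw weights take about half the space of FP16. The actual size of the files in this repository may differ.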

### HumanEval (argmax)

| Model | Q4 | Q6 | Q8 | FP16 |
| --- | ---- | ----- | ----- | ----- |
| [Qwen3-0.6B-exl3-8bpw-h8](https://huggingface.co/isogen/Qwen3-0.6B-exl3-8bpw-h8) | 0.0% | 38.4% | 40.9% | 40.2% |
| [Qwen3-0.6B-Base-exl3-8bpw-h8](https://huggingface.co/isogen/Qwen3-0.6B-Base-exl3-8bpw-h8) | 0.0% | 36.0% | 37.2% | 36.6% |