## Supported?
Expect broken or faulty items for the time being. Use at your own discretion.
- ComfyUI-GGUF: all? (CPU/CUDA)
  - Fast dequant: BF16, Q8_0, Q5_1, Q5_0, Q4_1, Q4_0, Q6_K, Q5_K, Q4_K, Q3_K, Q2_K
  - Slow dequant: all other types, via gguf/NumPy (see the sketch after this list)
- Forge: TBC
- stable-diffusion.cpp: see the llama.cpp feature matrix
  - CPU: all
  - CUDA: all?
  - Vulkan: >= Q3_K_S, > IQ4_S; PRs for IQ1_S, IQ1_M and IQ4_XS
  - other: ?
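As a rough illustration of the slow-dequant path noted above, the sketch below reads a few tensors from one of these files with the gguf Python package and expands them to float32 with NumPy. It assumes a recent gguf-py that provides `gguf.quants.dequantize`; the filename is a placeholder.

```python
# Minimal sketch of CPU dequantization via gguf/NumPy (the "slow" path).
# Assumes a recent gguf-py (pip install gguf) exposing gguf.quants.dequantize;
# the filename below is a placeholder.
import numpy as np
from gguf import GGUFReader
from gguf.quants import dequantize

reader = GGUFReader("flux1-dev-Q4_K_S.gguf")

for tensor in reader.tensors[:8]:
    if tensor.tensor_type.name in ("F32", "F16", "BF16"):
        continue  # already float, nothing to dequantize
    # tensor.data is the raw quantized block data; dequantize() expands it
    # to float32 on the CPU, which is why this path is slow.
    weights = dequantize(tensor.data, tensor.tensor_type)
    print(tensor.name, tensor.tensor_type.name, tuple(tensor.shape), weights.size)
```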
## Disco
Filename | Quant type | File Size | Description / L2 Loss Step 25 | Example Image |
---|---|---|---|---|
## Caesar
Combined imatrix from multiple images at 512x512 and 768x768, with 25, 30, and 50 steps, computed on city96/flux1-dev-Q8_0 (euler).
data: load_imatrix: loaded 314 importance matrix entries from imatrix_caesar.dat computed on 475 chunks
Using llama.cpp quantize cae9fb4 with modified lcpp.patch.
Dynamic quantization (a sketch of this style of per-tensor override follows the list):
- img_in, guidance_in.in_layer, final_layer.linear: F32/BF16/F16
- guidance_in, final_layer: BF16/F16
- img_attn.qkv, linear1: some layers two bits up
- txt_mod.lin, txt_mlp, txt_attn.proj: some layers one bit down
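The bumps above come from the modified lcpp.patch; the snippet below is only a hypothetical Python illustration of that style of name-based per-tensor override. The quant ladder, layer ranges, and bump amounts are assumptions for illustration, not the actual patch logic.

```python
# Hypothetical illustration of name-based per-tensor quant overrides
# (not the actual lcpp.patch logic; ladder and rules are made up).
def pick_quant(name: str, base: str, layer: int, n_layers: int) -> str:
    ladder = ["Q2_K", "Q3_K", "Q4_K", "Q5_K", "Q6_K", "Q8_0", "F16"]

    def bump(q: str, steps: int) -> str:
        i = min(max(ladder.index(q) + steps, 0), len(ladder) - 1)
        return ladder[i]

    # Most sensitive tensors stay in float.
    if name.startswith(("img_in.", "guidance_in.in_layer.", "final_layer.linear.")):
        return "F16"
    # First/last blocks of img_attn.qkv and linear1 get extra levels.
    edge = layer < 3 or layer >= n_layers - 3
    if (".img_attn.qkv." in name or ".linear1." in name) and edge:
        return bump(base, +2)
    # Middle-block text-side tensors tolerate one level less.
    if any(k in name for k in (".txt_mod.lin.", ".txt_mlp.", ".txt_attn.proj.")) and not edge:
        return bump(base, -1)
    return base

# Example: a middle double-block txt_mlp weight at base Q4_K drops to Q3_K.
print(pick_quant("double_blocks.9.txt_mlp.0.weight", "Q4_K", layer=9, n_layers=19))
```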
Experimental, quantized from F16.
Filename | Quant type | File Size | Description / L2 Loss Step 25 | Example Image |
---|---|---|---|---|
flux1-dev-IQ1_S.gguf | IQ1_S | 2.41GB | worst / 173 | Example |
flux1-dev-TQ1_0.gguf | TQ1_0 | 2.64GB | worst / 195 | Example |
flux1-dev-IQ1_M.gguf | IQ1_M | 2.72GB | worst / 171 | Example |
flux1-dev-IQ2_XXS.gguf | IQ2_XXS | 3.10GB | worst * / 126 | Example |
flux1-dev-TQ2_0.gguf | TQ2_0 | 3.12GB | worst / 202 | Example |
flux1-dev-IQ2_XS.gguf | IQ2_XS | 3.48GB | worst / 140 | Example |
flux1-dev-IQ2_S.gguf | IQ2_S | 3.51GB | worst / 142 | Example |
flux1-dev-IQ2_M.gguf | IQ2_M | 3.84GB | bad / 120 | Example |
flux1-dev-Q2_K_S.gguf | Q2_K_S | 4.00GB | ok * / 52 | Example |
flux1-dev-Q2_K.gguf | Q2_K | 4.03GB | ok / 55 | Example |
flux1-dev-IQ3_XXS.gguf | IQ3_XXS | 4.56GB | ok / 92 | Example |
flux1-dev-IQ3_XS.gguf | IQ3_XS | 5.05GB | bad / 125 | Example |
flux1-dev-Q3_K_S.gguf | Q3_K_S | 5.10GB | ok / 48 | Example |
flux1-dev-IQ3_S.gguf | IQ3_S | 5.11GB | bad / 123 | Example |
flux1-dev-Q3_K_M.gguf | Q3_K_M | 5.13GB | ok / 50 | Example |
flux1-dev-IQ3_M.gguf | IQ3_M | 5.14GB | bad / 123 | Example |
flux1-dev-Q3_K_L.gguf | Q3_K_L | 5.17GB | ok / 61 | Example |
flux1-dev-IQ4_XS.gguf | IQ4_XS | 6.33GB | good / 33 | Example |
flux1-dev-Q4_K_S.gguf | Q4_K_S | 6.66GB | good / 22 | Example |
flux1-dev-Q4_K_M.gguf | Q4_K_M | 6.69GB | good / 21 | Example |
flux1-dev-IQ4_NL.gguf | IQ4_NL | 6.69GB | good / 24 | Example |
flux1-dev-Q4_0.gguf | Q4_0 | 6.81GB | good / 30 | Example |
flux1-dev-Q4_1.gguf | Q4_1 | 7.55GB | good / 27 | Example |
flux1-dev-Q5_K_S.gguf | Q5_K_S | 8.26GB | nice / 21 | Example |
flux1-dev-Q5_0.gguf | Q5_0 | 8.27GB | good / 30 | Example |
flux1-dev-Q5_K_M.gguf | Q5_K_M | 8.30GB | nice / 23 | Example |
flux1-dev-Q5_1.gguf | Q5_1 | 8.99GB | nice * / 14 | Example |
flux1-dev-Q6_K.gguf | Q6_K | 9.80GB | nice / 20 | Example |
flux1-dev-Q8_0.gguf | Q8_0 | 12.3GB | near perfect * / 8 | Example |
- | F16 | 23.8GB | reference | Example |
Filename | Bits used for img_attn.qkv & linear1 per block |
---|---|
flux1-dev-IQ1_S.gguf | 333M MMMM M111 ... 11MM MM11 |
flux1-dev-TQ1_0.gguf | 3332 2222 2111 ... 1122 2211 |
flux1-dev-IQ1_M.gguf | 3332 2222 2111 ... 1122 2211 |
flux1-dev-IQ2_XXS.gguf | 4433 3333 3222 ... 2222 |
flux1-dev-TQ2_0.gguf | 3332 2222 2111 ... 1122 2211 |
flux1-dev-IQ2_XS.gguf | 4443 3333 3222 ... 2233 3322 |
flux1-dev-IQ2_S.gguf | 4444 4444 4444 4444 4433 3222 ... 2233 3322 |
flux1-dev-IQ2_M.gguf | 4444 4444 4444 4444 4433 3222 ... 2223 3333 3322 |
flux1-dev-Q2_K_S.gguf | 4443 3333 3222 ... 2222 |
flux1-dev-Q2_K.gguf | 4443 3333 3222 ... 2233 3322 |
flux1-dev-IQ3_XXS.gguf | 444S SSSS S333 ... 3333 |
flux1-dev-IQ3_XS.gguf | 444S SSSS S333 ... 33SS SS33 |
flux1-dev-Q3_K_S.gguf | 5554 4444 4333 ... 3333 |
flux1-dev-IQ3_S.gguf | 5554 4444 4333 ... 3344 4433 |
flux1-dev-Q3_K_M.gguf | 5554 4444 4333 ... 3344 4433 |
flux1-dev-IQ3_M.gguf | 5554 4444 4444 4444 4433 ... 3344 4433 |
flux1-dev-Q3_K_L.gguf | 5554 4444 4444 4444 4433 ... 3344 4433 |
flux1-dev-IQ4_XS.gguf | 8885 5555 5444 ... 4444 |
flux1-dev-Q4_K_S.gguf | 8885 5555 5444 ... 4444 |
flux1-dev-Q4_K_M.gguf | 8885 5555 5555 5555 5544 ... 4444 |
flux1-dev-IQ4_NL.gguf | 8885 5555 5555 5555 5544 ... 4444 |
flux1-dev-Q4_0.gguf | 8885 5555 5444 ... 4444 |
flux1-dev-Q4_1.gguf | 8885 5555 5444 ... 4444 |
flux1-dev-Q5_K_S.gguf | FFF6 6666 6666 6666 6655 ... 5555 |
flux1-dev-Q5_0.gguf | FFF8 8888 8555 ... 5555 |
flux1-dev-Q5_K_M.gguf | FFF8 8888 8666 6666 6655 ... 5555 |
flux1-dev-Q5_1.gguf | FFF8 8888 8555 ... 5555 |
flux1-dev-Q6_K.gguf | FFF8 8888 8666 .. 6666 |
flux1-dev-Q8_0.gguf | FFF8 8888 .. 8888 |
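The per-block patterns above can be checked against a file by listing the quant type of each img_attn.qkv / linear1 tensor with gguf-py. A minimal sketch, assuming the FLUX tensor naming used in these files (double_blocks.N.img_attn.qkv, single_blocks.N.linear1) and a placeholder filename:

```python
# Minimal sketch: print the quant type of each img_attn.qkv / linear1 weight
# per block, assuming gguf-py and the usual FLUX tensor names.
from gguf import GGUFReader

reader = GGUFReader("flux1-dev-Q4_K_S.gguf")  # placeholder filename

for t in reader.tensors:
    if (".img_attn.qkv." in t.name or ".linear1." in t.name) and t.name.endswith(".weight"):
        print(f"{t.name}: {t.tensor_type.name}")
```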
### Observations
- More imatrix data does not necessarily result in better quants.
- I-quants appear worse than K-quants of the same bit width?
## Bravo
Combined imatrix from multiple images at 512x512, with 25 and 50 steps, computed on city96/flux1-dev-Q8_0 (euler).
Using llama.cpp quantize cae9fb4 with modified lcpp.patch.
Experimental, quantized from F16.
Filename | Quant type | File Size | Description / L2 Loss Step 25 | Example Image |
---|---|---|---|---|
flux1-dev-IQ1_S.gguf | IQ1_S | 2.45GB | worst / 156 | Example |
flux1-dev-IQ1_M.gguf | IQ1_M | 2.72GB | worst / 141 | Example |
flux1-dev-IQ2_XXS.gguf | IQ2_XXS | 3.19GB | worst / 131 | Example |
flux1-dev-IQ2_XS.gguf | IQ2_XS | 3.56GB | worst / 125 | - |
flux1-dev-IQ2_S.gguf | IQ2_S | 3.56GB | worst / 125 | - |
flux1-dev-IQ2_M.gguf | IQ2_M | 3.93GB | worst / 120 | Example |
flux1-dev-Q2_K_S.gguf | Q2_K_S | 4.02GB | ok / 56 | Example |
flux1-dev-IQ3_XXS.gguf | IQ3_XXS | 4.66GB | TBC / 68 | Example |
flux1-dev-IQ3_XS.gguf | IQ3_XS | 5.22GB | worse than IQ3_XXS / 115 | Example |
flux1-dev-Q3_K_S.gguf | Q3_K_S | 5.22GB | TBC / 34 | Example |
flux1-dev-IQ4_XS.gguf | IQ4_XS | 6.42GB | TBC / 25 | - |
flux1-dev-Q4_0.gguf | Q4_0 | 6.79GB | TBC / 31 | - |
flux1-dev-IQ4_NL.gguf | IQ4_NL | 6.79GB | TBC / 21 | Example |
flux1-dev-Q4_K_S.gguf | Q4_K_S | 6.79GB | TBC / 29 | Example |
flux1-dev-Q4_1.gguf | Q4_1 | 7.53GB | TBC / 24 | - |
flux1-dev-Q5_0.gguf | Q5_0 | 8.27GB | TBC / 25 | - |
flux1-dev-Q5_1.gguf | Q5_1 | TBC | TBC / 24 | - |
flux1-dev-Q5_K_S.gguf | Q5_K_S | 8.27GB | TBC / 20 | Example |
flux1-dev-Q6_K.gguf | Q6_K | 9.84GB | TBC / 19 | Example |
flux1-dev-Q8_0.gguf | Q8_0 | - | TBC / 10 | - |
- | F16 | 23.8GB | reference | Example |
### Observations
- Bravo IQ1_S worse than Alpha?
- Latent loss
- Per-layer quantization cost, from chrisgoringe/casting_cost
- Per-layer quantization cost 2, from Freepik/flux.1-lite-8B: double blocks and single blocks
- Ablation: latent loss per weight type
- Pareto front of loss vs. size (see the sketch after this list)
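A minimal sketch of the two measurements referenced above, under the assumption that the L2 loss is computed between the latents of a quantized run and the F16 reference at the same step, and that the Pareto front is taken over (file size, loss) pairs; this is not the exact evaluation script used for this card.

```python
# Minimal sketch of latent L2 loss and a Pareto front over (size, loss).
# Assumed definitions, not the exact evaluation used for this card.
import numpy as np

def latent_l2_loss(latent_quant: np.ndarray, latent_ref: np.ndarray) -> float:
    # One common definition: mean squared difference between the quantized
    # run's latent and the F16 reference latent at the same step and seed.
    return float(np.mean((latent_quant - latent_ref) ** 2))

def pareto_front(points):
    # points: iterable of (size_gb, loss). Keep the points that are not
    # dominated by another point with both smaller size and lower loss.
    front = []
    for size, loss in sorted(points):
        if not front or loss < front[-1][1]:
            front.append((size, loss))
    return front

# A few (size_gb, loss) pairs taken from the Caesar table above.
print(pareto_front([(2.41, 173), (4.03, 55), (6.69, 21), (6.81, 30), (8.99, 14), (12.3, 8)]))
```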
## Alpha
Simple imatrix: single 512x512 image, 8/20 steps, computed on city96/flux1-dev-Q3_K_S (euler).
data: load_imatrix: loaded 314 importance matrix entries from imatrix.dat computed on 7 chunks
Using llama.cpp quantize cae9fb4 with modified lcpp.patch.
Experimental, quantized from Q8_0.
Filename | Quant type | File Size | Description / L2 Loss Step 25 | Example Image |
---|---|---|---|---|
flux1-dev-IQ1_S.gguf | IQ1_S | 2.45GB | worst / 152 | Example |
- | IQ1_M | - | broken | - |
flux1-dev-TQ1_0.gguf | TQ1_0 | 2.63GB | TBC | - |
flux1-dev-TQ2_0.gguf | TQ2_0 | 3.19GB | TBC | - |
flux1-dev-IQ2_XXS.gguf | IQ2_XXS | 3.19GB | worst / 130 | Example |
flux1-dev-IQ2_XS.gguf | IQ2_XS | 3.56GB | worst / 129 | Example |
flux1-dev-IQ2_S.gguf | IQ2_S | 3.56GB | worst / 129 | - |
flux1-dev-IQ2_M.gguf | IQ2_M | 3.93GB | worst / 121 | - |
flux1-dev-Q2_K.gguf | Q2_K | 4.02GB | TBC | - |
flux1-dev-Q2_K_S.gguf | Q2_K_S | 4.02GB | ok / 77 | Example |
flux1-dev-IQ3_XXS.gguf | IQ3_XXS | 4.66GB | TBC | Example |
flux1-dev-IQ3_XS.gguf | IQ3_XS | 5.22GB | TBC | - |
flux1-dev-IQ3_S.gguf | IQ3_S | 5.22GB | TBC | - |
flux1-dev-IQ3_M.gguf | IQ3_M | 5.22GB | TBC | - |
flux1-dev-Q3_K_S.gguf | Q3_K_S | 5.22GB | TBC / 36 | Example |
flux1-dev-Q3_K_M.gguf | Q3_K_M | 5.36GB | TBC | - |
flux1-dev-Q3_K_L.gguf | Q3_K_L | 5.36GB | TBC | - |
flux1-dev-IQ4_XS.gguf | IQ4_XS | 6.42GB | TBC | Example |
flux1-dev-IQ4_NL.gguf | IQ4_NL | 6.79GB | TBC / 23 | Example |
flux1-dev-Q4_0.gguf | Q4_0 | 6.79GB | TBC | - |
- | Q4_K | TBC | TBC | - |
flux1-dev-Q4_K_S.gguf | Q4_K_S | 6.79GB | TBC / 26 | Example |
flux1-dev-Q4_K_M.gguf | Q4_K_M | 6.93GB | TBC | - |
flux1-dev-Q4_1.gguf | Q4_1 | 7.53GB | TBC | - |
flux1-dev-Q5_K_S.gguf | Q5_K_S | 8.27GB | TBC / 19 | Example |
flux1-dev-Q5_K.gguf | Q5_K | 8.41GB | TBC | - |
- | Q5_K_M | TBC | TBC | - |
flux1-dev-Q6_K.gguf | Q6_K | 9.84GB | TBC | - |
- | Q8_0 | 12.7GB | near perfect / 10 | Example |
- | F16 | 23.8GB | reference | Example |
### Observations
Sub-quants are not differentiated as expected: IQ2_XS == IQ2_S, IQ3_XS == IQ3_S == IQ3_M, Q3_K_M == Q3_K_L.
- Check if lcpp_sd3.patch includes more specific quant level logic
- Extrapolate the existing level logic