Supported?

Expect broken or faulty quants for the time being. Use at your own discretion.

Disco

| Filename | Quant type | File size | Description / L2 loss (step 25) | Example image |
|---|---|---|---|---|

Caesar

Combined imatrix: multiple images at 512x512 and 768x768; 25, 30, and 50 steps; computed on city96/flux1-dev-Q8_0 with the euler sampler.

Imatrix data: `load_imatrix: loaded 314 importance matrix entries from imatrix_caesar.dat computed on 475 chunks`

Quantized with llama.cpp `quantize` (commit cae9fb4) and the modified `lcpp.patch`.

Dynamic quantization:

  • img_in, guidance_in.in_layer, final_layer.linear: kept at f32/bf16/f16
  • guidance_in, final_layer: kept at bf16/f16
  • img_attn.qkv, linear1: some layers quantized two bits higher
  • txt_mod.lin, txt_mlp, txt_attn.proj: some layers quantized one bit lower
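
As a rough illustration, the rules above can be read as a tensor-name lookup. The function below is a hypothetical sketch, not the patch's actual code: the function name, the "bits up/down" arithmetic, and the exact name matching are assumptions, and the real patch applies the bumps only to some layers.

```python
# Hypothetical sketch of the dynamic-quantization rules listed above.
# Tensor names and the bit arithmetic are illustrative only; the actual
# per-layer selection is done inside the modified lcpp.patch.

def pick_quant(tensor_name, base_bits):
    """Pick a bit width (or float type) for one tensor of the model."""
    # Always kept at full precision (f32/bf16/f16 in the rules above):
    if tensor_name in ("img_in", "guidance_in.in_layer", "final_layer.linear"):
        return "f16"
    # Remaining guidance/final tensors stay at bf16/f16:
    if tensor_name.startswith(("guidance_in", "final_layer")):
        return "bf16"
    # Sensitive image-attention tensors: some layers two bits up.
    if ".img_attn.qkv" in tensor_name or ".linear1" in tensor_name:
        return base_bits + 2
    # Text-side tensors tolerate lower precision: some layers one bit down.
    if any(k in tensor_name for k in (".txt_mod.lin", ".txt_mlp", ".txt_attn.proj")):
        return base_bits - 1
    return base_bits

print(pick_quant("double_blocks.0.img_attn.qkv.weight", 4))  # 6
```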

Experimental, quantized from f16

| Filename | Quant type | File size | Description / L2 loss (step 25) | Example image |
|---|---|---|---|---|
| flux1-dev-IQ1_S.gguf | IQ1_S | 2.41GB | worst / 173 | Example |
| flux1-dev-TQ1_0.gguf | TQ1_0 | 2.64GB | worst / 195 | Example |
| flux1-dev-IQ1_M.gguf | IQ1_M | 2.72GB | worst / 171 | Example |
| flux1-dev-IQ2_XXS.gguf | IQ2_XXS | 3.10GB | worst * / 126 | Example |
| flux1-dev-TQ2_0.gguf | TQ2_0 | 3.12GB | worst / 202 | Example |
| flux1-dev-IQ2_XS.gguf | IQ2_XS | 3.48GB | worst / 140 | Example |
| flux1-dev-IQ2_S.gguf | IQ2_S | 3.51GB | worst / 142 | Example |
| flux1-dev-IQ2_M.gguf | IQ2_M | 3.84GB | bad / 120 | Example |
| flux1-dev-Q2_K_S.gguf | Q2_K_S | 4.00GB | ok * / 52 | Example |
| flux1-dev-Q2_K.gguf | Q2_K | 4.03GB | ok / 55 | Example |
| flux1-dev-IQ3_XXS.gguf | IQ3_XXS | 4.56GB | ok / 92 | Example |
| flux1-dev-IQ3_XS.gguf | IQ3_XS | 5.05GB | bad / 125 | Example |
| flux1-dev-Q3_K_S.gguf | Q3_K_S | 5.10GB | ok / 48 | Example |
| flux1-dev-IQ3_S.gguf | IQ3_S | 5.11GB | bad / 123 | Example |
| flux1-dev-Q3_K_M.gguf | Q3_K_M | 5.13GB | ok / 50 | Example |
| flux1-dev-IQ3_M.gguf | IQ3_M | 5.14GB | bad / 123 | Example |
| flux1-dev-Q3_K_L.gguf | Q3_K_L | 5.17GB | ok / 61 | Example |
| flux1-dev-IQ4_XS.gguf | IQ4_XS | 6.33GB | good / 33 | Example |
| flux1-dev-Q4_K_S.gguf | Q4_K_S | 6.66GB | good / 22 | Example |
| flux1-dev-Q4_K_M.gguf | Q4_K_M | 6.69GB | good / 21 | Example |
| flux1-dev-IQ4_NL.gguf | IQ4_NL | 6.69GB | good / 24 | Example |
| flux1-dev-Q4_0.gguf | Q4_0 | 6.81GB | good / 30 | Example |
| flux1-dev-Q4_1.gguf | Q4_1 | 7.55GB | good / 27 | Example |
| flux1-dev-Q5_K_S.gguf | Q5_K_S | 8.26GB | nice / 21 | Example |
| flux1-dev-Q5_0.gguf | Q5_0 | 8.27GB | good / 30 | Example |
| flux1-dev-Q5_K_M.gguf | Q5_K_M | 8.30GB | nice / 23 | Example |
| flux1-dev-Q5_1.gguf | Q5_1 | 8.99GB | nice * / 14 | Example |
| flux1-dev-Q6_K.gguf | Q6_K | 9.80GB | nice / 20 | Example |
| flux1-dev-Q8_0.gguf | Q8_0 | 12.3GB | near perfect * / 8 | Example |
| - | F16 | 23.8GB | reference | Example |

| Filename | Bits: img_attn.qkv & linear1 |
|---|---|
| flux1-dev-IQ1_S.gguf | `333M MMMM M111 ... 11MM MM11` |
| flux1-dev-TQ1_0.gguf | `3332 2222 2111 ... 1122 2211` |
| flux1-dev-IQ1_M.gguf | `3332 2222 2111 ... 1122 2211` |
| flux1-dev-IQ2_XXS.gguf | `4433 3333 3222 ... 2222` |
| flux1-dev-TQ2_0.gguf | `3332 2222 2111 ... 1122 2211` |
| flux1-dev-IQ2_XS.gguf | `4443 3333 3222 ... 2233 3322` |
| flux1-dev-IQ2_S.gguf | `4444 4444 4444 4444 4433 3222 ... 2233 3322` |
| flux1-dev-IQ2_M.gguf | `4444 4444 4444 4444 4433 3222 ... 2223 3333 3322` |
| flux1-dev-Q2_K_S.gguf | `4443 3333 3222 ... 2222` |
| flux1-dev-Q2_K.gguf | `4443 3333 3222 ... 2233 3322` |
| flux1-dev-IQ3_XXS.gguf | `444S SSSS S333 ... 3333` |
| flux1-dev-IQ3_XS.gguf | `444S SSSS S333 ... 33SS SS33` |
| flux1-dev-Q3_K_S.gguf | `5554 4444 4333 ... 3333` |
| flux1-dev-IQ3_S.gguf | `5554 4444 4333 ... 3344 4433` |
| flux1-dev-Q3_K_M.gguf | `5554 4444 4333 ... 3344 4433` |
| flux1-dev-IQ3_M.gguf | `5554 4444 4444 4444 4433 ... 3344 4433` |
| flux1-dev-Q3_K_L.gguf | `5554 4444 4444 4444 4433 ... 3344 4433` |
| flux1-dev-IQ4_XS.gguf | `8885 5555 5444 ... 4444` |
| flux1-dev-Q4_K_S.gguf | `8885 5555 5444 ... 4444` |
| flux1-dev-Q4_K_M.gguf | `8885 5555 5555 5555 5544 ... 4444` |
| flux1-dev-IQ4_NL.gguf | `8885 5555 5555 5555 5544 ... 4444` |
| flux1-dev-Q4_0.gguf | `8885 5555 5444 ... 4444` |
| flux1-dev-Q4_1.gguf | `8885 5555 5444 ... 4444` |
| flux1-dev-Q5_K_S.gguf | `FFF6 6666 6666 6666 6655 ... 5555` |
| flux1-dev-Q5_0.gguf | `FFF8 8888 8555 ... 5555` |
| flux1-dev-Q5_K_M.gguf | `FFF8 8888 8666 6666 6655 ... 5555` |
| flux1-dev-Q5_1.gguf | `FFF8 8888 8555 ... 5555` |
| flux1-dev-Q6_K.gguf | `FFF8 8888 8666 .. 6666` |
| flux1-dev-Q8_0.gguf | `FFF8 8888 .. 8888` |
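
The L2 loss figures above compare each quant's step-25 output against the f16 reference image. A minimal sketch of such a comparison is below; the exact metric, color space, and scaling behind the table's numbers are assumptions, not the author's published method.

```python
import numpy as np

def l2_loss(img_a, img_b):
    """Euclidean (L2) distance between two images given as pixel arrays."""
    a = np.asarray(img_a, dtype=np.float64)
    b = np.asarray(img_b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("images must have identical dimensions")
    return float(np.sqrt(np.sum((a - b) ** 2)))

# Identical images score 0; larger values mean the quantized model's
# output drifted further from the f16 reference.
```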

Observations

  • More imatrix data doesn't necessarily result in better quants.
  • I-quants appear to perform worse than k-quants of the same bit width.
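
The second point can be sanity-checked against the Caesar table: pairing each roughly 5.1GB i-quant with the k-quant of similar size shows consistently higher loss for the i-quants (L2 values copied from the table above).

```python
# L2 loss values copied from the Caesar table above (lower is better).
l2 = {
    "IQ3_XS": 125, "Q3_K_S": 48,
    "IQ3_S": 123, "Q3_K_M": 50,
    "IQ3_M": 123, "Q3_K_L": 61,
}

# Each i-quant shows higher loss than the k-quant of comparable file size.
for iq, kq in [("IQ3_XS", "Q3_K_S"), ("IQ3_S", "Q3_K_M"), ("IQ3_M", "Q3_K_L")]:
    assert l2[iq] > l2[kq]
```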

Bravo

Combined imatrix: multiple images at 512x512; 25 and 50 steps; computed on city96/flux1-dev-Q8_0 with the euler sampler.

Quantized with llama.cpp `quantize` (commit cae9fb4) and the modified `lcpp.patch`.

Experimental from f16

| Filename | Quant type | File size | Description / L2 loss (step 25) | Example image |
|---|---|---|---|---|
| flux1-dev-IQ1_S.gguf | IQ1_S | 2.45GB | worst / 156 | Example |
| flux1-dev-IQ1_M.gguf | IQ1_M | 2.72GB | worst / 141 | Example |
| flux1-dev-IQ2_XXS.gguf | IQ2_XXS | 3.19GB | worst / 131 | Example |
| flux1-dev-IQ2_XS.gguf | IQ2_XS | 3.56GB | worst / 125 | - |
| flux1-dev-IQ2_S.gguf | IQ2_S | 3.56GB | worst / 125 | - |
| flux1-dev-IQ2_M.gguf | IQ2_M | 3.93GB | worst / 120 | Example |
| flux1-dev-Q2_K_S.gguf | Q2_K_S | 4.02GB | ok / 56 | Example |
| flux1-dev-IQ3_XXS.gguf | IQ3_XXS | 4.66GB | TBC / 68 | Example |
| flux1-dev-IQ3_XS.gguf | IQ3_XS | 5.22GB | worse than IQ3_XXS / 115 | Example |
| flux1-dev-Q3_K_S.gguf | Q3_K_S | 5.22GB | TBC / 34 | Example |
| flux1-dev-IQ4_XS.gguf | IQ4_XS | 6.42GB | TBC / 25 | - |
| flux1-dev-Q4_0.gguf | Q4_0 | 6.79GB | TBC / 31 | - |
| flux1-dev-IQ4_NL.gguf | IQ4_NL | 6.79GB | TBC / 21 | Example |
| flux1-dev-Q4_K_S.gguf | Q4_K_S | 6.79GB | TBC / 29 | Example |
| flux1-dev-Q4_1.gguf | Q4_1 | 7.53GB | TBC / 24 | - |
| flux1-dev-Q5_0.gguf | Q5_0 | 8.27GB | TBC / 25 | - |
| flux1-dev-Q5_1.gguf | Q5_1 | TBC | TBC / 24 | - |
| flux1-dev-Q5_K_S.gguf | Q5_K_S | 8.27GB | TBC / 20 | Example |
| flux1-dev-Q6_K.gguf | Q6_K | 9.84GB | TBC / 19 | Example |
| flux1-dev-Q8_0.gguf | Q8_0 | - | TBC / 10 | - |
| - | F16 | 23.8GB | reference | Example |

Observations

Alpha

Simple imatrix: single 512x512 image; 8/20 steps; computed on city96/flux1-dev-Q3_K_S with the euler sampler.

Imatrix data: `load_imatrix: loaded 314 importance matrix entries from imatrix.dat computed on 7 chunks`

Quantized with llama.cpp `quantize` (commit cae9fb4) and the modified `lcpp.patch`.

Experimental, quantized from Q8_0

| Filename | Quant type | File size | Description / L2 loss (step 25) | Example image |
|---|---|---|---|---|
| flux1-dev-IQ1_S.gguf | IQ1_S | 2.45GB | worst / 152 | Example |
| - | IQ1_M | - | broken | - |
| flux1-dev-TQ1_0.gguf | TQ1_0 | 2.63GB | TBC | - |
| flux1-dev-TQ2_0.gguf | TQ2_0 | 3.19GB | TBC | - |
| flux1-dev-IQ2_XXS.gguf | IQ2_XXS | 3.19GB | worst / 130 | Example |
| flux1-dev-IQ2_XS.gguf | IQ2_XS | 3.56GB | worst / 129 | Example |
| flux1-dev-IQ2_S.gguf | IQ2_S | 3.56GB | worst / 129 | - |
| flux1-dev-IQ2_M.gguf | IQ2_M | 3.93GB | worst / 121 | - |
| flux1-dev-Q2_K.gguf | Q2_K | 4.02GB | TBC | - |
| flux1-dev-Q2_K_S.gguf | Q2_K_S | 4.02GB | ok / 77 | Example |
| flux1-dev-IQ3_XXS.gguf | IQ3_XXS | 4.66GB | TBC | Example |
| flux1-dev-IQ3_XS.gguf | IQ3_XS | 5.22GB | TBC | - |
| flux1-dev-IQ3_S.gguf | IQ3_S | 5.22GB | TBC | - |
| flux1-dev-IQ3_M.gguf | IQ3_M | 5.22GB | TBC | - |
| flux1-dev-Q3_K_S.gguf | Q3_K_S | 5.22GB | TBC / 36 | Example |
| flux1-dev-Q3_K_M.gguf | Q3_K_M | 5.36GB | TBC | - |
| flux1-dev-Q3_K_L.gguf | Q3_K_L | 5.36GB | TBC | - |
| flux1-dev-IQ4_XS.gguf | IQ4_XS | 6.42GB | TBC | Example |
| flux1-dev-IQ4_NL.gguf | IQ4_NL | 6.79GB | TBC / 23 | Example |
| flux1-dev-Q4_0.gguf | Q4_0 | 6.79GB | TBC | - |
| - | Q4_K | TBC | TBC | - |
| flux1-dev-Q4_K_S.gguf | Q4_K_S | 6.79GB | TBC / 26 | Example |
| flux1-dev-Q4_K_M.gguf | Q4_K_M | 6.93GB | TBC | - |
| flux1-dev-Q4_1.gguf | Q4_1 | 7.53GB | TBC | - |
| flux1-dev-Q5_K_S.gguf | Q5_K_S | 8.27GB | TBC / 19 | Example |
| flux1-dev-Q5_K.gguf | Q5_K | 8.41GB | TBC | - |
| - | Q5_K_M | TBC | TBC | - |
| flux1-dev-Q6_K.gguf | Q6_K | 9.84GB | TBC | - |
| - | Q8_0 | 12.7GB | near perfect / 10 | Example |
| - | F16 | 23.8GB | reference | Example |

Observations

Sub-quants are not differentiated as expected: IQ2_XS == IQ2_S, IQ3_XS == IQ3_S == IQ3_M, and Q3_K_M == Q3_K_L.

  • Check whether lcpp_sd3.patch includes more specific quant-level logic
  • Extrapolate the existing quant-level logic
Eviation/flux-imatrix: GGUF quantizations of flux1-dev (11.9B params, flux architecture).