The base model is 4-bit (“native MXFP4 quantization”); is it useful to upscale it to 8-bit?
The only gpt-oss models released today are already quantized to 4-bit (MXFP4), so wouldn't 8-bit imply upscaling?
I don't have a deep understanding of the various quantization types, so I'm probably missing something.
In quantization, “4-bit” usually refers to INT4, whose expressible dynamic range differs from MXFP4's.
Hence MXFP4 → INT4 will be lossy. INT8 is also lossy, but since INT8 is known to be nearly lossless (in terms of perplexity) for models trained in fp32 or bf16, it is probably good enough.
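To make the dynamic-range point concrete, here is a minimal sketch (plain Python, no dependencies) that enumerates the value grids of INT4 and of MXFP4's E2M1 elements. The block-scale handling is simplified: per the OCP Microscaling spec, each 32-element MXFP4 block also shares one power-of-two scale, which shifts this grid but doesn't change its shape.

```python
# Contrast the representable values of signed INT4 vs. the E2M1 elements
# used by MXFP4 (ignoring the shared per-block power-of-two scale).

def int4_values():
    # Signed INT4: 16 uniformly spaced integers.
    return list(range(-8, 8))

def e2m1_values():
    # FP4 / E2M1: 1 sign bit, 2 exponent bits, 1 mantissa bit.
    magnitudes = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
    return sorted({s * m for s in (-1.0, 1.0) for m in magnitudes})

print("INT4:", int4_values())   # uniform grid
print("E2M1:", e2m1_values())   # non-uniform: dense near zero, coarse near ±6
```

Because the two grids don't coincide, requantizing MXFP4 weights onto the INT4 grid forces extra rounding; the much denser INT8 grid absorbs the E2M1 values with far less error, which is why 8-bit usually comes out close to lossless.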
I can't load the gpt-oss-20b-mlx model; it says:
🥲 Failed to load the model
Failed to load model
Error when loading model: ValueError: Model type gpt_oss not supported.
What can I do? Does LM Studio not support this model yet? The model was downloaded from the lmstudio-community page.
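A "Model type gpt_oss not supported" error typically means the bundled MLX engine predates the gpt-oss architecture, so updating LM Studio (and its MLX runtime) is the first thing to try. As a hedged sketch, you can also check whether a current standalone mlx-lm recognizes the architecture outside LM Studio; the repo id below is illustrative, so substitute the model you actually downloaded:

```python
# Sketch: verify gpt_oss support in the mlx-lm Python package directly.
# Requires a recent release: pip install -U mlx-lm
# (older versions raise the same "Model type gpt_oss not supported" error).
from mlx_lm import load, generate

# Hypothetical repo id -- replace with the model you downloaded.
model, tokenizer = load("lmstudio-community/gpt-oss-20b-MLX-8bit")
print(generate(model, tokenizer, prompt="Hello", max_tokens=32))
```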
Error message in LM Studio when trying to load this model:
🥲 Failed to load the model
Failed to load model
Error when loading model: ValueError: Received 264 parameters not in model:
model.layers.0.mlp.experts.down_proj.biases,
model.layers.0.mlp.experts.down_proj.weight.scales,
model.layers.0.mlp.experts.down_proj.weight.weight,
model.layers.0.mlp.experts.gate_proj.bias,
model.layers.0.mlp.experts.gate_proj.biases,
model.layers.0.mlp.experts.gate_proj.scales,
model.layers.0.mlp.experts.gate_proj.weight,
model.layers.0.mlp.experts.up_proj.bias,
model.layers.0.mlp.experts.up_proj.biases,
model.layers.0.mlp.experts.up_proj.scales,
model.layers.0.mlp.experts.up_proj.weight,
… (the same 11 entries repeat for each of layers 1–23, 264 parameters in total).
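These are all quantization companion tensors for the MoE expert projections (the extra .scales / .biases keys alongside each weight), which suggests the checkpoint was converted with a newer quantization layout than the loader understands; updating the app and its MLX engine is the usual fix. If you want to confirm what the downloaded checkpoint actually contains, here is a minimal sketch using the safetensors package (the local path is a placeholder):

```python
# Sketch: list the quantization companion tensors in a downloaded MLX
# checkpoint to see the .scales / .biases keys the loader is rejecting.
from pathlib import Path
from safetensors import safe_open

model_dir = Path("~/path/to/gpt-oss-20b-mlx").expanduser()  # placeholder path

for shard in sorted(model_dir.glob("*.safetensors")):
    with safe_open(shard, framework="np") as f:
        for name in f.keys():
            if name.endswith((".scales", ".biases")):
                print(shard.name, name)
```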