Quantized to MXFP4 using llama.cpp b6150
I also made a Q6_K via gguf-my-repo on my personal account, but it isn't actually any better in quality and it's much larger. It looks like llama.cpp quantization is poor quality for these models, with MXFP4 being the exception.
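For reference, a typical llama.cpp conversion-and-quantization run looks roughly like the sketch below. The paths are placeholders, and the quant type name `MXFP4_MOE` is an assumption for this build; list the types your binary actually supports with `llama-quantize --help`.

```bash
# Convert the original HF checkpoint to a full-precision GGUF first
# (script ships with llama.cpp; paths here are placeholders).
python convert_hf_to_gguf.py /path/to/hf-model --outfile model-f16.gguf

# Quantize to MXFP4. The type name MXFP4_MOE is assumed for this build;
# verify against the list printed by `llama-quantize --help`.
./llama-quantize model-f16.gguf model-mxfp4.gguf MXFP4_MOE
```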