Converted to MLX 4bit quantisation for you
#5
opened by overhead520
Thanks a lot for this great model!
Converted to MLX 4bit quantisation for you. It runs about 20% faster on Apple Silicon than an equivalent Q4_K_M GGUF.
https://huggingface.co/mlx-community/L3.3-MS-Nevoria-70b-4bit
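In case it helps anyone trying it out, here's a minimal sketch of loading and running the 4-bit conversion with mlx-lm (assuming `pip install mlx-lm` on an Apple Silicon Mac; argument names may vary slightly between mlx-lm versions):

```python
# Minimal sketch: load and run the 4-bit MLX conversion with mlx-lm.
# Assumes mlx-lm is installed (`pip install mlx-lm`); the generate()
# signature shown here matches recent mlx-lm releases but may differ
# in older or newer versions.
from mlx_lm import load, generate

# Downloads the quantised weights from the Hub on first use.
model, tokenizer = load("mlx-community/L3.3-MS-Nevoria-70b-4bit")

prompt = "Write a short scene set in a rain-soaked city."

# verbose=True prints tokens as they are generated along with speed stats.
response = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
print(response)
```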