Converted to MLX 4bit quantisation for you
#5
opened by overhead520
Thanks a lot for this great model!
Converted to MLX 4bit quantisation for you. It runs about 20% faster on Apple Silicon than an equivalent Q4_K_M GGUF.
https://huggingface.co/mlx-community/L3.3-MS-Nevoria-70b-4bit
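In case it helps anyone trying it out, here's a minimal sketch of loading and running the 4-bit conversion with mlx-lm (assuming `pip install mlx-lm` on an Apple Silicon Mac; argument names may vary slightly between mlx-lm versions):

```python
# Minimal sketch: load and run the 4-bit MLX conversion with mlx-lm.
# Assumes mlx-lm is installed (`pip install mlx-lm`); the generate()
# signature shown here matches recent mlx-lm releases but may differ
# in older or newer versions.
from mlx_lm import load, generate

# Downloads the quantised weights from the Hub on first use.
model, tokenizer = load("mlx-community/L3.3-MS-Nevoria-70b-4bit")

prompt = "Write a short scene set in a rain-soaked city."

# verbose=True prints tokens as they are generated along with speed stats.
response = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
print(response)
```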