4-bit MLX?

#3
by Kiliouch - opened

It would be great to get the model in MLX format, 4-bit quantized as well, especially since that would make full GPU offload easier on lower-spec systems. Some 4-bit MLX quants have already been released and offer better tokens/sec than GGUF, but the (super useful) reasoning effort toggle does not show up in LM Studio.
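
For reference, a minimal sketch of how such a quant could be produced with the `mlx_lm` convert helper (assuming the standard `mlx-lm` package API; the repo id shown is a placeholder, not this model):

```python
# Minimal sketch: convert a Hugging Face checkpoint to a 4-bit MLX quant.
# Assumes the mlx-lm package is installed (pip install mlx-lm) and that its
# convert() helper accepts these keyword arguments; the repo id below is a
# placeholder, not the actual model discussed in this thread.
from mlx_lm import convert

convert(
    hf_path="org/model-name",   # placeholder Hugging Face repo id
    mlx_path="mlx_model_4bit",  # output directory for the quantized weights
    quantize=True,              # enable weight quantization
    q_bits=4,                   # 4-bit quantization
    q_group_size=64,            # common group size for MLX quants
)
```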
