4-bit MLX?

#3
by Kiliouch - opened

It would be great to get the model in MLX format, 4-bit quantized as well, especially since that would make full GPU offload easier on lower-spec systems. Some 4-bit MLX quants have already been released and offer better tokens/sec than GGUF, but the (super useful) reasoning effort toggle does not show up in LM Studio.
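
For reference, a minimal sketch of how such a quant could be produced with the `mlx_lm` convert helper (assuming the standard `mlx-lm` package API; the repo id shown is a placeholder, not this model):

```python
# Minimal sketch: convert a Hugging Face checkpoint to a 4-bit MLX quant.
# Assumes the mlx-lm package is installed (pip install mlx-lm) and that its
# convert() helper accepts these keyword arguments; the repo id below is a
# placeholder, not the actual model discussed in this thread.
from mlx_lm import convert

convert(
    hf_path="org/model-name",   # placeholder Hugging Face repo id
    mlx_path="mlx_model_4bit",  # output directory for the quantized weights
    quantize=True,              # enable weight quantization
    q_bits=4,                   # 4-bit quantization
    q_group_size=64,            # common group size for MLX quants
)
```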
