Feature request: Release 4-bit DWQ variants for Qwen3-Coder (480B-A35B)

#2
by wake6 - opened

Hi MLX team,

I’d love to use a 4-bit DWQ version of Qwen3-Coder (480B-A35B).

Why: On Apple Silicon, DWQ typically recovers a noticeable share of the quality lost to 4-bit quantization at no extra runtime cost, which makes it a good fit for long-context coding and agent workflows.
My setup: Mac Studio (M3 Ultra, 512 GB unified memory), MLX via mlx_lm.
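
For a rough sense of why a 4-bit build should fit on this machine, here is a back-of-the-envelope estimate. It is only a sketch under assumptions that are mine, not confirmed for any official artifact: every one of the nominal 480B parameters is quantized to 4 bits with a group size of 64 and a 16-bit scale and 16-bit bias per group. Real checkpoints may keep some layers at higher precision, and the KV cache and activations are not counted. The helper `quantized_weight_gb` is a hypothetical name used for illustration.

```python
def quantized_weight_gb(num_params, bits=4, group_size=64,
                        scale_bits=16, bias_bits=16):
    """Approximate weight storage in GB (10^9 bytes) for group-wise quantization.

    Assumption: each group of `group_size` weights shares one scale and one
    bias, so the per-weight overhead is (scale_bits + bias_bits) / group_size.
    """
    bits_per_weight = bits + (scale_bits + bias_bits) / group_size
    return num_params * bits_per_weight / 8 / 1e9


# Nominal total parameter count taken from the model name (480B-A35B).
total_params = 480e9
weights_gb = quantized_weight_gb(total_params)
print(f"Estimated 4-bit weight size: {weights_gb:.0f} GB")
```

Under these assumptions the weights alone come to a bit over half of the 512 GB of unified memory, which leaves room for long contexts but not much slack for a second large model.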

If there’s an internal DWQ workflow you’d like the community to try (e.g., mlx_lm.dwq with specific datasets/lengths), I’m happy to follow it and provide feedback/results.

Thanks for considering this!

Note: MLX has recently been improving DWQ and dynamic quantization (e.g., the KL loss and memory updates), so official 4-bit DWQ artifacts for large MoE models like this one would help many Apple Silicon users.
