Feature request: Release 4-bit DWQ variants for Qwen3-Coder (480B-A35B)
#2, opened by wake6
Hi MLX team,
I’d love to have a 4-bit DWQ version of Qwen3-Coder (480B-A35B).
Why: On Apple Silicon, DWQ typically recovers a noticeable share of the quality lost to 4-bit quantization at no extra runtime cost, which makes it a good fit for long-context coding and agent workflows.
My setup: Mac Studio (M3 Ultra, 512 GB unified memory), MLX via mlx_lm.
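For a rough sense of why 4-bit is the target on this machine, here is a back-of-the-envelope estimate of the weight footprint. It is only a sketch under assumptions that are mine, not confirmed for this model: every one of the 480B parameters is quantized, the group size is 64, and each group stores a 16-bit scale and a 16-bit bias. The helper name `estimate_weight_gb` is hypothetical, and KV cache and activations are not counted.

```python
def estimate_weight_gb(num_params: float, bits: int = 4, group_size: int = 64,
                       scale_bits: int = 16, bias_bits: int = 16) -> float:
    """Approximate size of group-quantized weights in decimal GB.

    Assumes (hypothetically) that all parameters are quantized and that each
    group of `group_size` weights carries one scale and one bias.
    """
    # Per-weight cost = payload bits + per-group overhead spread over the group.
    bits_per_weight = bits + (scale_bits + bias_bits) / group_size
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    weights_gb = estimate_weight_gb(480e9)  # 480B total parameters
    print(f"Estimated 4-bit weights: {weights_gb:.0f} GB of 512 GB unified memory")
```

Under those assumptions the weights come to roughly 4.5 bits each, which leaves headroom for long contexts, whereas 8-bit weights would use most of the 512 GB.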
If there’s an internal DWQ workflow you’d like the community to try (e.g., mlx_lm.dwq with specific datasets/lengths), I’m happy to follow it and provide feedback/results.
Thanks for considering this!
Note: Since MLX has been improving DWQ and dynamic quantization (e.g., KL loss and memory updates), publishing official 4-bit DWQ artifacts for large MoE models like this one would be very helpful to many Apple Silicon users.