Update README.md
README.md CHANGED
@@ -65,7 +65,7 @@ python -m mlx_lm generate --model halley-ai/gpt-oss-20b-MLX-5bit-gs32 \
 
 LM Studio / CLI (MLX, Q5 gs=32) ≈2k-token responses:
 - M1 Max (32 GB): ~45–50 tok/s, 0.40–0.60 s TTFB
-- M4 Pro (24 GB):
+- M4 Pro (24 GB): ~65–70 tok/s, 0.25–0.45 s TTFB
 - M3 Ultra (256 GB): pending
 
 Throughput varies with Mac model, context, and sampler settings.
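For context, a minimal sketch of how the ≈2k-token CLI runs above can be reproduced, based on the `python -m mlx_lm generate` command in the hunk header. The prompt text and `--max-tokens` value are illustrative placeholders, not the exact settings behind the reported numbers, and no sampler flags are assumed:

```shell
# Illustrative ~2k-token generation with the Q5 gs=32 MLX build;
# prompt and token budget are placeholders, not the benchmark's exact settings.
python -m mlx_lm generate \
  --model halley-ai/gpt-oss-20b-MLX-5bit-gs32 \
  --prompt "Write a detailed overview of Apple silicon unified memory." \
  --max-tokens 2048
```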