sebastavar committed (verified)
Commit 3b80908 · Parent(s): 382c3b0

Update README.md

Files changed (1): README.md +1 -1
README.md CHANGED
@@ -65,7 +65,7 @@ python -m mlx_lm generate --model halley-ai/gpt-oss-20b-MLX-5bit-gs32 \
 
 LM Studio / CLI (MLX, Q5 gs=32) ≈2k-token responses:
 - M1 Max (32 GB): ~45–50 tok/s, 0.40–0.60 s TTFB
-- M4 Pro (24 GB): pending
+- M4 Pro (24 GB): ~65–70 tok/s, 0.25–0.45 s TTFB
 - M3 Ultra (256 GB): pending
 
 Throughput varies with Mac model, context, and sampler settings.
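
For context, a minimal sketch of how figures like these (TTFB and tok/s) could be measured locally with mlx_lm's streaming API. Assumptions not taken from the commit: that `stream_generate` yields roughly one chunk per generated token in the installed mlx_lm version, and the prompt string is a placeholder; the model path is the repo named in the hunk header.

```python
# Rough measurement sketch, not the benchmark script used for the README numbers.
import time
from mlx_lm import load, stream_generate

# Model repo taken from the command shown in the diff hunk header.
model, tokenizer = load("halley-ai/gpt-oss-20b-MLX-5bit-gs32")

prompt = "Write a ~2000-token overview of the Apollo program."  # placeholder prompt

start = time.perf_counter()
first_token_at = None
n_tokens = 0

for _chunk in stream_generate(model, tokenizer, prompt, max_tokens=2048):
    if first_token_at is None:
        first_token_at = time.perf_counter()  # time to first token (reported as TTFB)
    n_tokens += 1  # assumes one yielded chunk per generated token

elapsed = time.perf_counter() - start
ttfb = first_token_at - start
print(f"TTFB: {ttfb:.2f} s")
print(f"Generation throughput: {n_tokens / (elapsed - ttfb):.1f} tok/s")
```

Numbers measured this way will still vary with context length and sampler settings, which is why the README hedges the per-machine figures as ranges.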