by mswphd 58 days ago
An "obvious" point to make is that it is not particularly usable on a unified memory machine. I'm only getting 9 tok/s (with a Q6 quant) on an M4 Pro MacBook with 48GB of memory (though with GGUFs, not MLX).

The quality seems fine, but the 9 tok/s means I only tried it out briefly.