Hacker News
by andrewmunsell 566 days ago
Another data point:

17.6 tokens/s on an M4 Max with a 40-core GPU