Hacker News
by j0hnyl 572 days ago
How many tokens per second?
3 comments

Another data point:

17.6 tokens/s on an M4 Max with a 40-core GPU

I am away from my computer, but I think it was about 10 tokens/second - not too bad.
8.4 tps on M1 Pro chip with 32GB RAM (Q4 model, 18GB).