by mhitza 320 days ago
That M4 Max is really something else. I also get 70 tokens/second on eval on an RTX 4000 SFF Ada server GPU.