by dust42 3 hours ago
With an M5 16c 48GB and Qwen 3.6 35B Q4 I get up to 1900 tokens/s prompt processing (PP) and 80 tokens/s token generation (TG). With an Nvidia 5090 I get 7800 PP and 280 TG.

Together with pi mono I wouldn't want to go back to Claude & Co. Speed, quality of the answers, short response times at any time of day - once you have eaten from the fruit, your definition of SOTA will change...

For reference, I have been doing software development for 30 years; I am not vibe coding the umpteenth todo list.