by gbgarbeb 26 days ago
40-80 tok/s is unusable to you? Ok.

If you like having a box with 8-12 fans blasting hot air and noise into your office all day, nobody's stopping you.

1 comment

If you are not being paid by Apple, I feel sorry for you. Cause that means you are so bought into the cult that you are delusional.

The 40-80 tok/sec is only for initial prompt processing, and only with the "medium" models, like Qwen3.6:27b. The actual token generation is around 10 tokens/second. That's very slow. And your MacBook Pro will stop being a LAP-top, because it will get very warm.

Meanwhile, my 2x3090s happily crank out ~100 tok/sec generation. Oh, and I can get 100 tok/sec on my phone as well, because I can just access Ollama on my home desktop over SSH from Termux.
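If you want to check the prompt-processing vs. generation split yourself instead of taking anyone's word for it, here is a rough sketch. It assumes Ollama's `/api/generate` endpoint on its default port 11434 and the timing fields it reports in a non-streaming response (`prompt_eval_count`, `prompt_eval_duration`, `eval_count`, `eval_duration`, durations in nanoseconds). The helper names and the `home-desktop` host in the comment are made up for illustration.

```python
import json
import urllib.request

NS_PER_S = 1e9  # Ollama reports durations in nanoseconds


def tokens_per_second(count, duration_ns):
    """Tokens per second from a token count and a duration in nanoseconds."""
    if not duration_ns:
        return 0.0
    return count / (duration_ns / NS_PER_S)


def split_speeds(resp):
    """Separate prompt-processing speed from generation speed.

    `resp` is the parsed JSON of a non-streaming /api/generate response.
    """
    return {
        "prompt_tok_s": tokens_per_second(
            resp.get("prompt_eval_count", 0), resp.get("prompt_eval_duration", 0)
        ),
        "gen_tok_s": tokens_per_second(
            resp.get("eval_count", 0), resp.get("eval_duration", 0)
        ),
    }


def query(model, prompt, host="http://localhost:11434"):
    """Send one non-streaming request to an Ollama server and return its JSON.

    From a phone you could first forward the port over SSH (hypothetical host):
        ssh -L 11434:localhost:11434 user@home-desktop
    """
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        host + "/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as r:
        return json.load(r)


if __name__ == "__main__":
    # Made-up numbers, only to show the arithmetic: 600 prompt tokens in 10 s,
    # 100 generated tokens in 10 s.
    fake = {
        "prompt_eval_count": 600,
        "prompt_eval_duration": 10_000_000_000,
        "eval_count": 100,
        "eval_duration": 10_000_000_000,
    }
    print(split_speeds(fake))
```

Run `split_speeds(query("your-model", "some long prompt"))` on each machine and compare the two numbers separately; quoting one tok/s figure without saying which phase it covers is how threads like this happen.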

Please, allow me to show you how far I can pee. It's several feet more than you can and a much mightier stream to boot.