| HN Mirror


by ActorNightly 21 days ago

If you are not being paid by Apple, I feel sorry for you, because that means you are so bought into the cult that you are delusional.

The 40-80 tok/sec figure applies only to initial prompt processing, and only with the "medium" models, like Qwen3.6:27b. The actual token generation is around 10 tok/sec. That's very slow. And your MacBook Pro will stop being a LAP-top, because it will get very warm.

Meanwhile, my 2x3090s happily crank out ~100 tok/sec generation. Oh, and I can get 100 tok/sec on my phone as well, because I can just access Ollama on my home desktop over SSH from Termux.
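
For anyone who wants to check the prompt-processing vs. generation split on their own box, here is a rough sketch, not a benchmark harness. It assumes the desktop's Ollama port has been forwarded to the local machine over SSH (the host name in the comment is made up) and uses the timing fields Ollama returns from `/api/generate`. The helper names `throughput` and `generate` are mine, not part of any library.

```python
import json
import urllib.request

# Assumption: the desktop's Ollama port (default 11434) is forwarded locally, e.g.
#   ssh -L 11434:localhost:11434 user@home-desktop   # hypothetical host name
OLLAMA_URL = "http://localhost:11434/api/generate"

NS_PER_SEC = 1e9  # Ollama reports durations in nanoseconds


def throughput(stats: dict) -> dict:
    """Split an Ollama response into prompt-processing and generation tok/sec."""
    return {
        "prompt_tok_per_sec": stats["prompt_eval_count"]
        / (stats["prompt_eval_duration"] / NS_PER_SEC),
        "gen_tok_per_sec": stats["eval_count"]
        / (stats["eval_duration"] / NS_PER_SEC),
    }


def generate(model: str, prompt: str) -> dict:
    """Send one non-streaming request and return the parsed JSON response."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Usage (needs a reachable Ollama server; the model tag is whatever you have pulled):
#   print(throughput(generate("<your-model-tag>", "Explain KV caching briefly.")))
```

The two numbers are computed separately on purpose: a fast prompt-eval rate says nothing about how quickly tokens come out afterwards. The prompt-eval fields may be absent if the server reuses a cached prompt, so treat a `KeyError` there as "no measurement" rather than a failure.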

1 comment

gbgarbeb 19 days ago

Please, allow me to show you how far I can pee. It's several feet more than you can and a much mightier stream to boot.
