Hacker News
by lzaborowski 106 days ago
One thing I’ve noticed with local models is that people tolerate a lot more trial-and-error behavior. When a hosted model wastes tokens it feels expensive, but when a local model loops a bit it just feels like it’s “thinking.”

If models like Qwen can get good enough for coding tasks locally, the real shift might be economic rather than purely a matter of capability.

1 comment

Wasted tokens are preferred for local models: I need the GPU mainframe in my bedroom to heat it, as I live in a third-world country with unreliable heating (Switzerland).