| HN Mirror



	by heipei 1 hour ago
	Depends on what you mean by "local". On your MacBook, large dense models like Qwen 3.6 27B will be slow, sure. On a local workstation with a dedicated RTX card you can get > 100 tps, which is more than good enough to work with, and faster than cloud models in many cases.


jstanley 1 hour ago

But how smart is it? All the people running local models never seem to mention that they are way dumber than cloud models.

I don't care how many tokens per second of nonsense it can generate.


throwawayffffas 14 minutes ago

Qwen 3.6 35B A3B is about as good as Sonnet 4.5. It varies, but it's at that level.


notnullorvoid 38 minutes ago

Quantized Gemma 4 26B is as smart as or better than GPT 5 in most of my testing. Granted, GPT 5 is nearly a year old at this point, but I can run Gemma 4 on a ~6-year-old consumer GPU (RTX 3090) and get 140 t/s.


heipei 1 hour ago

It is smart enough that I use it for all my coding tasks, and a lot of other mundane tasks.

It is probably not smart enough for "design the whole architecture of this complex system from scratch, make no mistakes", but that is not something I want from a coding tool anyway. I want a model that I can point at a file and tell to make some changes to that file and related files. Or one that I can ask to review a PR with regard to certain aspects.

My suggestion is to simply try it and see what it feels like.


myaccountonhn 1 hour ago

It's not going to be as good as Claude, but if you know what you're doing, it may be good enough to get your work done.


garciasn 1 hour ago

A highly skilled carpenter may be able to 'get work done' by banging nails in with a heavy-bottomed cocktail glass. That doesn't mean it isn't painful when the glass keeps breaking and leaves shards all over the workshop for you to find every day for the rest of your life, until you clean up the mess you made by using the wrong tool for the job.


CamperBob2 43 minutes ago

More like, a highly skilled carpenter can work miracles with a $6 hammer from the hardware store, while the pros on the commercial crew are using fancy compressed-air tools.

The carpenter has to get up close and personal with the wood. He can't match the crew's throughput, but maybe that's not what he's trying to do.


data-ottawa 1 hour ago

This is task dependent.

I find Devstral (even though it’s weak generally) much better at writing and documentation than Opus. I’m now delegating all documentation to Devstral instead of Claude, which makes a mess of it.


c0rruptbytes 21 minutes ago

I'm talking about the common use case that I think Hacker News people have:

you get a MacBook for work, you run the MacBook

they're not going to start giving GPUs to employees to run local models
