Hacker News
by jrochkind1 2 hours ago
What about local models do you find preferable?

I guess "starting to find them preferable" suggests to me you think they work better, but this is surprising to me so I think I may have misunderstood, so I ask!

Like you're saying they work better than the proprietary models (in what ways?), or you find them mostly good enough and prefer the privacy or cost, or what?

2 comments

There are a couple of things, but basically it boils down to the same reasons people prefer Linux to Windows/macOS: customization, control, and privacy (arguably all of these are really subsets of 'control').

Having full control over how your data is retained, what the system prompt is, which version of the model you're running, etc. leads to a much more consistent experience. For example, for chat sessions, I can't stand the new "let me push back" version of Claude. For my home models I never have to worry about that.

There's never a mystery as to whether the model's performance was secretly degraded: I always know exactly which model I'm using, how well it's utilizing resources, etc. Open models also give you full visibility into the reasoning steps, so you never have to guess what the model is thinking.

Then when you start getting into things like uncensored/abliterated models, we're talking about something you can't even pay for. In case you're unfamiliar, even open local models have guardrails built in, but people in the community have found ways to remove these. One of the things I've found most concerning about AI, which is under-discussed, is the combination of people having personal chats with an agent that both monitors the conversation and refuses to discuss certain topics. This leads to a very deep level of self-censoring that I find dystopian.

I also have multiple Hermes agents set up, some with local backends, others with open but non-local backends (e.g. Kimi through the API). For some tasks, I've just started to find that the local agent tends to work better for the kind of work I want (maybe it just overthinks less?). I don't use it for coding so much as research tasks and sysadmin stuff, but I've been really happy with the results.

Oh, and let's not forget, especially running on a Mac, these local models are basically free to run.

The local models are willing to share their thinking. The Big AI models don't share their thinking, leaving only vague summaries. Having an AI that deliberately cloaks its reasoning, that goes out of its way to act like Searle's Chinese Room experiment, that deliberately conceals information, is incredibly gross.

I love what I get from Opus or GPT, but mainly I use GLM, and it's starkly apparent how much better it is that it lets me work together with it: I can nudge it as it works by correcting bad assumptions or clarifying things for it. And... it just doesn't feel icky. It's not a quasi-mystical alien intelligence, the kind of thing which, honestly, gives me strong "this should be destroyed, is unsafe, and feels outright impermissible" vibes. As a coder, seeing thinking saves time and prevents errors. As a civilization, seeing thinking lets people understand what the AI is working with, grounds society in an appreciation for what is happening, and keeps us a little moored. Personally, if I were a government, I would not allow hidden reasoning.

Recent submission on this: "The text in Claude Code’s “Extended Thinking” output is not authentic." https://patrickmccanna.net/the-text-in-claude-codes-extended... https://news.ycombinator.com/item?id=48630535