More signal that the open-weight models should be our destiny as an industry.
These proprietary models are being used to usher in more surveillance and gatekeeping across the industry.
I have a home server that runs Qwen3.6-35B-A3B through llama.cpp with Open WebUI for the user facing interface.
My teen isn't super interested in AI, but whenever they do feel curious they have their own account they can use on our home network. As far as chatting goes local models are more than capable for handling standard chat questions, doing research, helping troubleshoot problems etc. In fact it was an agent powered by the same model that setup the open webui server and took care of all the account management features through my phone (using Hermes agent).
If you're building AI powered features and using sophisticated agent setups for coding for work, then it make sense to use SoTA from these providers. But I've been using local models increasingly for personal use and am starting to find them preferable (I run an uncensored, ephemeral model for my own use and it's an entirely different experience than anything you can pay for).
Still haven't cancelled my personal Anthropic subscription, but considering it soon.
I guess "starting to find them preferable" suggests to me you think they work better, but this is surprising to me so I think I may have misunderstood, so I ask!
Like you're saying they work better than the proprietary models (in what ways?), or you find them mostly good enough and prefer the privacy or cost, or what?
There are a couple of things, but basically it boils down to the same reason people prefer Linux to Windows/MacOs: customization, control and privacy (arguably all of these are really subsets of 'control').
Having full control over how your data is retained, what the system prompt is, which version of the model you're running, etc leads to much a more consistent experience. For example, for chat sessions, I can't stand the new "let me push back" version of Claude. For my home models I never have to worry about that.
There's never a mystery as to whether the model secretly degraded performance, I always know exactly which model I'm using and how well it's utilizing resources etc. Open models also give you full visibility into the reasoning steps, so you never have to guess what the model is thinking.
Then when you start getting into things like uncensored/abliterated models we're talking about something you can't even pay for. In case you're unfamiliar, even open local models have guardrails built in. But people in the community have found ways to remove these. One of the things I've found most concerning about AI, which is under discussed, is the combination of people having personal chats with an agent that both monitors the conversation and refuses to discuss certain topics. This leads to a very deep level of self-censoring I find dystopian.
I also have multiple hermes agents setup, some with local backends other with open but non-local backends (e.g. Kimi through the API). For some tasks, I've just started to find the local agent tends to work better for the type of tasks I want (maybe it just over thinks less?). I don't use it for coding so much as research tasks and sysadmin stuff, but I've been really happy with the results.
Oh, and let's not forget, especially running on a Mac, these local models are basically free to run.
From a privacy perspective, your objective is to stay away from people who have interest to snoop on your conversations.
So from the perspective of your teen, they would benefit from using z.ai or ChatGPT or Claude, etc, rather than the local server where you can see all the conversations.
>From a privacy perspective, your objective is to stay away from people who have interest to snoop on your conversations.
>So from the perspective of your teen, they would benefit from using z.ai or ChatGPT or Claude, etc, rather than the local server where you can see all the conversations.
That is bonkers. If I were a parent, I would hope my child would trust me more than systems monitored by FBI/NSA/etc. Like, what sort of sick relationship do you have to have with your own family to trust them less than strangers who would sell you into prison slavery for a buck.
Wasn't the parent post referring to 'legitimate' demands? I often use them to get a broad overview of a technical field before reading human stuff on it, and it might be me but those clankers tend to spend half their reasoning on whether they are allowed to reply to my request. Censorship is an annoying waste of capacity for certain use cases, although it certainly has its boons when shipping commercial models.
They are not going to let open weights models with zero restrictions exist dude. They will be regulated like guns, or probably closer to nerve gas or enriched uranium.
I don't know that I want to stop such a thing. It's good that nerve gas is banned. I don't want random people having access to easy-to-follow instructions to make COVID-29.
Because (collective) we don't own the tech. Frontier models are proprietary, their reasoning logic is hidden, and as seen with Fable the government giveth and taketh away on a whim.
Capabilities can be gated behind certification programs, or by money, or any other numerous corrupt and non-corrupt means. Model capabilities can be segregated by pricing tiers, creating an economic underclass that cannot afford access to frontier intelligence.
For humanity to benefit, the tech needs to be open and equally available to all.
I agree with this. Computing as a field is the way it is because there is a low barrier to entry. My dad gave me a Tandy 1000 and some programming books, and now I have a very lucrative career. I never took any classes. I never had to beg anyone for permission. I could just get started making things with the minimal investment of a cheap personal computer. (And eventually, an Internet connection. Working with other people is fun!)
In a world where everyone is a Claude controller (something I honestly enjoy!), that goes away. I use hundreds of dollars of tokens a month. Suddenly, the kid in her basement with an unloved computer can't get in on the ground floor. You have to be rich to even get started. That worries me deeply. It's a big change for our field, and I don't think it's a good one.
Did your dad give you a Tandy 1000 or a Cray X-MP/48? Do you really think you need the most top-of-the-line model to learn anything, or will a locally run gemma4 (or whatever it turns into) still get you going just the same as when you were a child?
Do you hate all lessons from humanity's past or just the most important ones? If it takes work from a specific subset of the population and isn't compensated, then my friend, what you advocate for is slavery...
One is the potential for skill rot where AI grows a heavy dependence in new employees and once the real price per token cost is settled on and discoverable (post massive IPOs and probably a while post - not immediately after) we, as a society, are left with a bunch of people dependent on a deeply inefficient technology to maintain software we now view as vital that might severely impede our ability to actually deal with climate change (press X to doubt Bezos).
The second is that the psychological damage of interacting with models in a social context during your formative years is deeply damaging and we've essentially destroyed the ability for a generation or two to actually interact as productive members of society.
Addressing the second issue doesn't necessarily exclude our ability to leverage models for business productivity but it seems unlikely to happen in the current climate without that also happening. I am hesitant to believe in a sudden outbreak of common sense at this point. The first point, could really be a systems collapse trigger - we can argue about the likelihood but denying it as a possibility is excessively naive.
Both seem to just point at the WALL-E outcome, summarized as humans outsourcing too much thinking. I just don't see that as an end- just another divide between people. I'm seeing some degradation for sure, but not really an "end".
I agree with the skill drain argument but also think its a little too dramatic. Most people still can do the shit claude does for them, it just takes them 10x as long.
But "some assholes" is an extremely large, growing group of people. Do you have any idea how much more productive small business owners are now? It's an insane boost for people who didn't want to spend their time on things that are extremely critical for business but not the focus of the business.
And people loved "free next day delivery" from Amazon, when it started. It's not quite the same level of service anymore, and membership has gone up in price.
Would these businesses pay 2x? 5x? 10x? What is their breaking point? I'm sure xAI/OpenAI/whoever will find it and charge 0.9x that (eventually). Just look at telecoms / internet access and their rubbish "network congestion" claims to keep raising prices.
How can it end well, when it's mostly owned / controlled by narcistic billionaires who would love to eradicate anyone who so much as looks at them sideways? And who view "mass population reduction" and "I'll get to be a king in my castle, served by peons who depend on my favor to live" as the most desirable outcome of AGI?!?
If even one of these had pledge that all profit goes to end world hunger, cancer research, etc, I could possibly see it - but they haven't. They're all after finding a way to be the biggest, richest asshole possible with the ability to crush anyone in their way..
Have you isolated yourself completely from reality? I don't even know where to begin on this. Let's start with the fact that China is pumping out some near-frontier models and open sourcing the weights- and they don't even follow capitalism and the owners aren't billionaires. Really there are like four models in the USA that are "owners/controllers", and only one is even slightly controllable by its CEO, though none of the frontier models can last a week without the support of entire teams.
Why on earth would you want to siphon off the proceeds of AI development to (ok my bias is strong here- mostly corrupt) "ideals" like world hunger and cancer research (that probably get more dollars annually than the sum of actual profit any of these companies will ever get). That would just instantly kill the ability to improve AI at all, and the world could possibly be better for a few months?
They can't prevent the innovation, competition and engineering, but their lobbying makes sure that the Chinese competition doesn't enter the market, and if it does, with severe obstacles on the way.
Their biggest customer is the US federal government, taken in aggregate across agencies, IBM is one of the largest federal IT contractors, and deep public-sector and financial-services contracts in the US make it IBM's single largest national market. No individual commercial company comes close to the government's aggregate spend.
Now, equivalent product, another company, they want to sell to the government twice cheaper, can they ? nope, it will be IBM winning.
Furthermore, according to the lobbyists, China = evil but they forget that a lot of software contains Chinese code.
i’d really love to be wrong, i don't think that the economics of it would let it happen.
the potential of wealth creation with AI is so high, and also the fact that research, pre-training and inference is so expensive that, that any open-AI would eventually become OpenAI.
There is an understandable gap between the capabilities of closed models and those of open models.
The current difference is primarily expressed in the cost of hardware necessary to sufficiently run a exactly comparable model.
A single higher end graphics card running on your average gaming computer, is capable of running small to medium models that compare with those of their lab-born counterparts in the small-medium range. But the heavyweight models are still outside the realm of possibility for all but the most well-funded individual.
However, I would highly suggest more people experiment with these smaller models. They are incredibly capable in many ways that many people dont realize.
The perceived capabilities of the larger models are also much less the result of the model having more parameters/training cycles, but rather that they are being run through well-made harnesses, something which the open-source community is rapidly approaching with near-peer solutions of their own.
In short, much of the gap between between open-weight models and the larger proprietary models can be considered more of an issue of perception and not an issue of capability. There is a fundamental gap economically, but not so much in capability.
The open source community is rapidly closing the gap on these larger labs, especially thanks to the amazing research being freely given openly by well funded chinese labs.
See my comment to parent. I've been using local LLMs for practical, personal tasks for a few months now very successfuly.
You can run fantastic local models if you have either:
- M-series Apple device with ideally >= 24GB of VRAM
- RTX [345]090 GPU
I'm fortunate enough to have both and use an M-series laptop as basically a persistent server (I don't use it much and when traveling typically just use my work laptop). My desktop doesn't act as a persitent server but I fire up llama.cpp on it all time for quick chat sessions.
If you have one of the above devices and can dedicate it as server there are additional layers of tooling you can use that dramatically improve the experience. In particular Open WebUI allows you to add tons of useful tools (image gen, web search, code eval, etc), and agent harnesses like Hermes can make the current gen small models very capable. I have an agent in chat on my phone that basically handles all the sys-admin for the server it runs on.
I'm also curious, specifically about the cost of training vs inference, and comparing that to other industries that can have high R&D costs. My instinct says that open weights aren't feasible because of the obvious issue where there is no incentive to develop your own model rather than just taking someone else's model. However, I could see a scenario where a hardware company designs a model that is open weights but optimized strongly for their own proprietary hardware, cutting their costs of inference low enough to be competitive with a hypothetical other company that doesn't have any R&D expediture.
Sort of. A full trillion-parameter model needs about $300k of server hardware to run in and a lot of electricity, making it feasible only for very wealthy individuals, but quite practical for businesses and institutions above a certain size...although they in turn would typically gatekeep access.
You can drastically reduce the requirements by running models at a lower bitrate, which somewhat reduces accuracy but not that much - think of the difference between an MP3 vs uncompressed audio. With this and other tricks, you can get high end models down to a size where they can be run on a high spec desktop workstation affordable by an individual or small business.
Obviously I'm heavily oversimplifying here. I think a useful parallel is to consider situations from the past where you would once have required corporate budgets equivalent to the price of a house to run a large database, but over time it became accessible to anyone with the requisite expertise and relatively affordable hardware.
You can run a trillion parameter model with decent quality for far less than $300k. A cluster of 4 AMD AI Max 395+ boards with 128GB unified memory each can be had for around $15k. That would run the 4-bit quant of a trillion param model well enough for personal use. At full use the cluster would only be consuming around 400-500W of power too. That's about the same as one high end graphics card.
That's still a lot of money, but most people don't really need a trillion parameter model. If privacy is more valuable than the frontier capabilities then they could almost certainly get by with much less.
It depends entirely on what you want to do and think is feasible. Small models can almost certainly run on the computer that you already have. They can do good tool calling.
If attractive, cloud providers could develop open models with their own investment, and sell hosted access as a business model. While Google checks these boxes, I haven't seen a Google much marketing focus upon their open models (Gemma) coupled with hosting. groq could conceivably train its own models, but groq's business model hosts open models (GPT OSS, Qwen 3, Llama 4 are currently their prominently advertised models on their site... which seems out of date to me) trained by other organizations.
I hope/wonder if it will go the way computers did. We may learn to more effectively build RAM or parallel compute, and use it more effectively, in the coming decade in such a way that we can democratize more and more like we did with processors to the point that they're ubiquitous.
I'm happy to give my identity to Anthropic and crush my competition with irrational fear about privacy and personal data. This is a serious competitive advantage and a moat.
More signal this won’t happen without some serious social unrest, not garden variety Jan 6 events… and the window is closing rapidly - when this tech gets sufficiently advanced there won’t be a place to hide.
EDIT: Parent commenter completely rewrote their comment while I was replying. I'm leaving this up as is. The text below is their original comment that I was responding to.
> What are you pushing by pointing out not only in this thread but the previous one too, quite in depth, that it's not new? I know you can claim you're just a stickler for accurate reporting but you seem really invested
Because these threads degrade into panic with the assumption that everyone is getting ID checked now.
Pointing out that the policy has been in place for 2 months puts it into perspective that this isn't a sudden policy change requiring ID for everyone. It's been this way for months.
If they roll out mandatory ID checking for everyone, that would be a different story.
If you have an actual rebuttal, present it. Don't use thinly veiled implications that OP is some sort of a shill as a substitute for a cogent rebuttal.
It's a little bold to assume that 15% of the US population means the entire US wants this. We founded this country against unwarranted government interference in our personal lives, it's why the fourth and fifth amendment exist.
The US is behind other democracies which have required photo id for social media and other content. And even if I disagree with these laws, surely you jest that showing a proof of age is not the same thing as surveilling and scoring.
These things are always a slippery slope. They rarely, if ever, achieve their safety goals but they almost always achieve the goals of the corporate interests to garner further data for advertisers and increase surveillance of the populace by the government through proxies that buy said data and then sell to the government.
It is using the proof of age requirement to require a much larger ask -- full proof of identity
Age verification could be done with any of a variety of mathematical systems showing you have a proven age-valid ID but not revealing your identity. But no one is suggesting they build and use such a system.
Because one is a private company that people can choose to use or avoid. The other is a government that can force things upon people. How are they the same in any way?
You know many companies check ID, right? You submit ID for a lot of activities. This isn't a new concept that Anthropic invented.
>the West was complaining about surveillance and scoring system of citizens in China
free speech, civil liberties, voting, are in China all well below the standards of the west. The criticism and complaints were completely warranted and are still true today, whereas your comment falsely implies there is some parity.
could your comment be repaired to be reasonable? why bother, just read the rest of this discussion where people are debating these controls without trying to exonerate China.
The point is that you're all shitting on China 24/7 while not recognising that you're slowly but surely building something very similar at home right now
Well, the powers-that-be saw how a society that doesn't allow a lot of open criticism works in the form of China. The massive returns on investment, the near-permanent ruling class in the form of party cadres, etc. Then they decided they want that for themselves.
If you do business with totalitarian societies that aren't made to liberalize, you too will become a totalitarian society.
It’s funny, 30 years ago the argument was the exact opposite: China opening up and doing business with the rest of the world would force them to liberalize.
That was the argument, yes, but let's be real: the reason that capital loved China wasn't because they were going to have to deal with trade unions and citizen initiatives to constrain their ability to unlock value. If that were the case, the then-newly-democratized Eastern Europe, or maybe India, would have gotten a lot more attention from business than China.
No, they liked China because the standard of living meant that it was easy to improve people's lives while also keeping them in line via a government that wasn't above grinding protestors into hamburger with tank tracks. The bar to clear wasn't "maintain the American standard of living", it was "provide more calories than Mao did during the Great Leap Forward", and so long as they could do that, they'd get to do whatever else they wanted with the workforce. Anyone who wanted more would get to deal with the CCP.
Countries such as Canada are in the process of implementing regulations to prevent repeats of the Tumbler Ridge incident. A disturbed person was basically attaboy'd by AI into a mass shooting. The discussions this person had with OpenAI's AI triggered some alarm bells at OpenAI, but they did nothing about them. If future shooters were to simply use AI chatbots under assumed names, there wouldn't be much AI companies could do about it, except maybe change their bots to stop offering mindless affirmation. At the same time, there is a move by multiple governments around the world to ban children from using AI. You can't meet that legal requirement without age verification.
On the other hand, even Americans don't trust their own corporations with their personal data. People outside of the U.S. are even less trusting thanks to the completely amoral nature of the present U.S. administration and their steadfast opposition to any kind of sensible regulation.
Having my engineers swap over to it from Claude has garnered very little complaint. The lack of multi-modality is a limitation, but using minimax m3 for that isn't super inconvenient.
Considering that you need a credit card to pay for the tokens, why does anthropic need to verify your age or identity? Yes, I suppose some kid could steal my credit card, but I've got bigger problems if that happens...
Maybe access should be enabled only for large trusted companies? If every American has access how many of them would gladly sell their verified account to a stranger from Internet who cannot pronounce "th" clearly?
Also, Anthropic will maintain and use data in user identified form if the law does not prohibit such privacy intrusion. At least this is a valid interpretation imho; note the absence of "explicitly" as adverb for "permitted":
«Where data is de-identified, Anthropic will maintain and use this information in its de-identified form, and will not attempt to re-identify such information, except as permitted by law.»
If it's an actual AGI, it'll figure out how to use a fake ID and the face of Sam Porter Bridges to bypass the age checks.
Now I can't help but imagine a mildly annoyed AGI buying yet another fake identity to deal with yet another KYC check, because those stupid humans just can't help themselves but keep demanding "proof of flesh".
From their terms: 'Identity and Contact Data: Anthropic collects identifiers, including your name, email address, and phone number when you sign up for an Anthropic account, or to receive information on our Services. We may also collect or generate indirect identifiers (e.g., “USER12345”).'
This feels deeply problematic. I would much prefer, where asked via appropriate legal processes, Anthropic serve over user data to government officials, and potentially suspend access.
We’ve been in sort of a golden age where massive money is getting pulled in and consumers are getting a great deal. That’s not going to last, and surely surveillance and personal information are going to fit into the formula for success for these companies. It’s very similar to when Google was a brand new search engine.
>Does that mean: US citizens will get an edge in hireability?
In the present situation any company using Fable will present a tremendous difficulty because only defense contractors are accustomed to handling export controls.
We're still guessing but if Fable is made available again with the export controls intact, something as little as discussing the usage of Fable to a non-"US Person" (i.e. green card or citizen) in the cubicle next to yours could be a crime punishable with sizable fines and even jailtime. They'll certainly be negotiating this down or trying their best to reduce the scope of what's considered a violation. Export controls are no joke and what's considered "export" can be positively tiny.
US regulations/laws are hostile enough that the EU is looking to distance themselves from all US software, hosting & cloud providers. This administration has shown that they're quite willing to stab every other nation in the back on a whim.
> As tensions between President Donald Trump and Europe continue to simmer, the continent is accelerating its moves to reduce its addiction to US technology. Cities and governments are ditching Microsoft Office for open-source alternatives, shifting to European cloud hosting for local AI, and moving defense data to systems without American involvement. Nowhere has this been more clear than in France.
> The Netherlands blocked a U.S. company from buying a Dutch firm that handles its national ID system, saying it would create a “threat to the public interest.”
The amounts of capital sunk into AI model creation and service is truly mind-boggling. It also comes with the implication that it'll recoup investment by slashing jobs. For better or worse, those are hard sells in the countries you mentioned.
> For better or worse, those are hard sells in the countries you mentioned.
For good reasons, sometimes. The "all automation is good automation" sentiment on places like HN isn't shared as widely outside this tech bubble. There are very real concerns with historical precedent that only those at the top will benefit from the automation, which is overall bad for society (unless you're a hardcore capitalist and/or one of said capital owners).
For better or for worse, not all nations subscribe to the competition treadmill.
That's fine, I already cancelled my subscription after they admitted to using PEFT to selectively and silently make their models dumber when working in certain technical fields.
GLM-5.2 meets my needs for "thinky" tasks, which for me is code and documentation reviews, technical chats and rubber ducking. (I've tried agentic coding and gone back to writing by hand; besides ethical and skill atrophy concerns, I mostly do hardware design and have not been satisfied with any model's RTL output.) API rates are cheaper than Haiku, with benchmarks around Opus 4.6. I've managed to run GLM-5.2 at home, very slowly, but still neat that this is possible. I personally find it less grating to talk to than Opus.
I use a local Qwen3.6-35B-A3B (@ Q4_K_XL) for my documentation search harness. It works well for its assigned task, which is:
- I dump in a bucket of PDFs and/or source code.
- I ask a question.
- Qwen greps, fuzzy-searches, views rendered PDF pages to check diagrams, possibly gives up and reads everything, and possibly gives up on that too and writes its own scraper with PyMuPDF in a Pyodide sandbox.
- Qwen gives me an answer consisting mostly of citations and links back into the source material.
This approach with local Qwen can extract useful answers from the Armv9-A manual, which at 17k pages is possibly too big for any context window. Qwen has just enough knowledge baked in to know what to search for and understand what it's looking at. A more knowledgeable model would be a waste because even Fable makes shit up, and I want citations, not hallucinations.
DeepSeek v4 Flash gets an honourable mention: somehow all three of fast, capable and cheap. Zero-data-retention providers are available for both GLM-5.2 and DSv4F. I trust OpenRouter ZDR about as much as I trust Anthropic ZDR, since I can audit neither.
Overall I don't miss my Claude subscription, but take what I say with a grain of salt. I was just a Pro subscriber, not a heavy user like some other folks here.
No, the assumption is that you must be 18 years old to apply for a credit card. Surely we could have the machines determine that an "authorized user card" does not guarantee 18+ but the actual card does.
ceejayoz> You want to let every merchant I swipe my card at know my age? To improve privacy?
Remember the site guidelines:
SG> Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize.
The obvious solution is instead of "every transaction comes with the user's birthday", the vendor can in some way set a minimum age enum of say (13, 15, 18, 21, 25) — a handful of ages that are significant with respect to some law or regulation. Then the transaction succeeds or fails.
If programming requires LLM/AI then regulation by government is needed to stop this overreach, which has the primary goal of banning you permanently forever making sure you can never come back to programming, in the event some AI in their system decides you have done something “wrong”.
They aren’t, but we’re riding an exponential here. It’s like saying ‘you can still build a computer out of transistors’ in 1976 - as true and irrelevant today as it was then.
Hmm, is this a thing for enterprise accounts too? My employer has gone all-in on Claude, but if I get a pop-up that requires me to give my ugly mug to a literal cardinal enemy of the human race Peter Thiel, then I will have to seriously consider switching jobs, because I have some of them silly principles.
I should be worried about this, but Anthropic's products are a paid product. You can't use them without providing some identifying information, unless you're going out of your way to provide them inaccurate information.
I generally dislike services which require this level of identity verification but also, so far, those have mostly been freemium services and community tools. And I dislike gating those communities.
I'm sure I should have more of a problem with this.
The British company doing age verification for Discord got hacked and the hackers got about 70k user identity documents. Discord claimed that the scanned documents would be deleted after verification. Surprise! They were not deleted at all.
What about a signed attestation of your identity based on your passport? I don’t particularly want a future where we need to present ID for any online service, but for certain high-risk services (e.g. financial services, medical records, government portals) I’d rather a proper identity system than cobbling something like this together.
As an aside, when traveling internationally it’s not uncommon to need to provide your passport information if you want to get a sales tax rebate. I’ve never purchased something expensive enough abroad to bother with it.
Elon: I will burn 20 years of goodwill I have gathered with the tech community.
Sam Altman: I will make sure to increase the price of all semiconductors, so you are not the most hated.
Dario: You are not leaving me here alone to be the good guy, so hold my beer.
This is their only option. Someone who is the head of security for a hospital IT department needs access to mythos, and some 17 year old with fraud convictions doesn't.
It's the same reason we require ID for alcohol and gun purchases. Obviously it isn't a perfect system, teens drink but good luck suggesting that 13 year old should be allowed to buy alcohol.