

by trunnell 48 days ago

I'll defend Anthropic.

They are clear about the reasons for guardrails: prevent their models from doing harm in dual-use contexts including CBRN or by accelerating research in authoritarian-backed AI labs.

What is the critique against that? It seems pretty reasonable to me. You want AI-accelerated biological or radiological experiments running in your neighbor's backyard? You want PRC-backed labs to continue to steal Anthropic's models via distillation?

Mitigating the harms of dual-use tech is notoriously difficult and fraught with trade-offs. What I would want to see is cautious rollout and quick response, which is EXACTLY what they're doing.

Instead, this thread is full of bad-faith arguments about Anthropic being dishonest, making a "useless" model, or "the power is going to their heads." You can't read Anthropic's System Cards and come away with any of these impressions. Quite the opposite, in fact. They are honest to a fault, acknowledging problems they discovered even when it hurts them.

If your harmless request was downgraded to Opus, you're billed for Opus. They were 100% clear about that. I'd much rather have a Mythos-class model that falls back to Opus 10% of the time than be capped to Opus 100% of the time. If that doesn't work for you, then make a suggestion for something better!

If you are a white-hat security engineer hitting guardrails, I don't think you have standing to complain. I really don't. Their Glasswing program actually got banks and the industrial sector to take action to fix security vulnerabilities. Do you realize how special that is? A huge portion of the economy runs on vulnerable code and has for decades, despite security experts testifying to Congress, begging business leaders, pleading for intervention, with no results. But suddenly they're all enrolled in a program that will find *and fix* vulnerabilities! White-hat security people should be rejoicing. Instead some of them are throwing rocks. Unbelievable. Shameful.

Meanwhile, society is screaming at the AI labs to be more conscientious about potential harms of AI. Legislatures are passing laws limiting data center construction. There are protests. And you, the HN community, the vanguard of our profession, have the temerity to demand "NO GUARDRAILS!" "HOW DARE YOU TRY TO PROTECT DEMOCRACY!" "MY SOFTWARE PROJECT IS MORE IMPORTANT THAN KEEPING NUKES AWAY FROM THE BAD GUYS!"

Go ahead HN, downvote me. It'd be an honor.

2 comments

zozbot234 48 days ago

The original reporting of this from Anthropic didn't mention "authoritarian-backed AI labs" at all, only frontier ML research while leaving it entirely unspecified and unverifiable what was meant by "frontier". It's obviously reasonable that people would complain about that. And the notion that distillation-at-a-distance could be used to comprehensively "steal" a model, especially a frontier reasoning model that's likely relying on massive amounts of test-time compute, is completely unproven and quite ludicrous if you know anything at all about ML.

trunnell 48 days ago

"Anthropic accused Chinese firms of 'industrial-scale distillation attacks' on its AI models."

"Distillation involves training less capable models on more advanced ones’ output, and can be used illicitly to acquire powerful capabilities cheaply. The AI startup accused China’s DeepSeek, MiniMax, and Moonshot of generating 'over 16 million exchanges with Claude through approximately 24,000 fraudulent accounts,'"

https://www.semafor.com/article/02/24/2026/anthropic-accuses...

After reading their posts and watching interviews with Dario, it's abundantly clear that they view Chinese-lab distillation of US frontier models as a threat to US national security. You can argue with them about whether that is true, but not whether distillation is real.

zozbot234 48 days ago

It's definitely real, in the sense that it's a real violation of ToS. It could perhaps be used to guide a few narrow capabilities in very specific domains, given a model that's already most of the way there. But no, it's nowhere near the same as "stealing" a model outright, nor does it replace basic innovation in AI. And it's indistinguishable from practices that have long been common in the industry as a matter of fact, regardless of any ToS requirements.

trunnell 48 days ago

Oh, I agree distillation isn't stealing "outright" as in it's not theft of 100% of the model. But there's a reason they're doing it. I didn't say anything about Chinese labs innovating -- obviously they are.

What accounts for the difference between your attitude that distillation is no big deal, "common practice," and Anthropic's view that it is a huge threat?

zozbot234 48 days ago

I never said that "it's no big deal". It's a clear-cut violation of ToS, and Anthropic are within their rights to care about that.

vzcx 47 days ago

Having a chatbot that talks to you about synthetic biology or nuclear physics is just not the same as being equipped to develop biological weapons or atomic bombs.

None of this will happen in the "neighbor's backyard." You are exaggerating the threats to "democracy" while simultaneously invoking democracy to limit freedom of information. The suggestion that somehow the bad guys will get nukes if we let people access information is just absurd.

Society at large is not concerned about whether someone asks the chatbot about organic chemistry. They are concerned that they will be de facto forced to interact with some shitty automated system to get by in life, like having to pass an AI-powered ATS to get a job.

They are tired of the hype and tired of idiots like Amodei being elevated to heights of power and influence. They are concerned that the things they love are being devalued. But they don't give a fuck if I ask an AI about genetically modifying viruses. This is a pet issue among some of the AI safety crowd.

So, yes, I am 100% fine with PRC-backed labs distilling Anthropic's models. I do not care about Anthropic. They have demonstrated that they are not on my side, and that they are at best ambivalent about actually empowering their users. I'm not a fan of the PRC either, but their distance makes them far less of a threat to me than companies like Anthropic and my own government.
