| HN Mirror


by maxloh 6 hours ago

Great to see more fully open LLMs.

I think a problem with open-weight models is that while you can improve them, you are not going to create the next generation of LLMs by fine-tuning. We are at the mercy of frontier labs for access to SOTA LLMs. For example, Anthropic recently started requiring identity verification for Claude [0], same for OpenAI [1].

If one day China's distillation labs stop releasing their LLMs as open-weight, I doubt American labs will continue to release free LLM weights without that competition.

That's where fully open pipelines shine: they enable the community to create the next generation of SOTA LLMs. That is the only way LLMs truly become sovereign.

[0]: https://news.ycombinator.com/item?id=48618455

[1]: https://news.ycombinator.com/item?id=48618606

2 comments

anon373839 6 hours ago

> China's distillation labs

This notion that Chinese labs are merely distilling frontier models is quite an unwarranted slur. Those labs have published WAY more useful research than US labs on RL techniques, novel model architectures, training pipelines, etc. They have also hit intelligence-per-parameter densities that US labs have yet to attain.

Apart from that, merely training a model on outputs from another model, off policy and without the logits, doesn’t really work that well.

The Chinese labs know how to build frontier level models. GLM-5.2 shows that they no longer even need Nvidia chips to do it.
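To make the point about logits concrete, here is a minimal sketch (not any lab's actual pipeline) of how much training signal a student gets per token in each setting. With the teacher's logits, the student can match the full next-token distribution via a KL term; with only sampled text, it sees a single token index per position. The logit values and the sampled token below are made up for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def cross_entropy_hard(target_index, q):
    """Cross-entropy against a single observed token (one-hot target)."""
    return -math.log(q[target_index])

# Hypothetical logits over a 4-token vocabulary (illustrative values only).
teacher_logits = [2.0, 1.5, -1.0, -3.0]
student_logits = [0.5, 0.2, 0.1, 0.0]

teacher_probs = softmax(teacher_logits)
student_probs = softmax(student_logits)

# With logits: the loss depends on the teacher's whole distribution.
soft_loss = kl_divergence(teacher_probs, student_probs)

# Without logits: only the one token the teacher happened to emit is visible.
sampled_token = 0  # assumption: token 0 was sampled from the teacher
hard_loss = cross_entropy_hard(sampled_token, student_probs)

print(f"soft-target KL loss:   {soft_loss:.4f}")
print(f"hard-label CE loss:    {hard_loss:.4f}")
```

The hard-label loss tells the student nothing about how close token 1 was to token 0 in the teacher's view; the soft-target loss does. This only illustrates the difference in signal, it does not measure how well either approach works in practice.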

trollbridge 3 hours ago

It's one of those lies people tell themselves to make themselves feel better. "Oh, they're just copying my stuff."

Chinese labs are basically just telling everyone, out in the open, what they're doing and how to do it, and the answer from American frontier labs is "Well, they couldn't possibly be getting the results they're getting without just distilling our models," and the American labs aren't even trying to do some of the stuff like DS's aggressive caching to get costs down.

Vaslo 5 hours ago

I recently watched a video about one of these “Chinese models”; it kept insisting it was Claude when the user asked. Sorry, there’s no “slur” here, just legitimate suspicion.

c0rruptbytes 4 hours ago

https://blog.kilo.ai/p/did-claude-opus-48-distill-alibabas

it happens to all models…when the internet is increasingly generated, things happen

anon373839 4 hours ago

These anecdotes where someone gets the model to claim it is X model are meaningless. (Claude also has been known to claim it is Deepseek when asked in Chinese.)

trollbridge 3 hours ago

As anyone who's tried to write an AGENTS.md that says "Place an Assisted-by: git trailer that contains the harness you're using:whatever model this is" knows, such a naive approach often results in a seemingly random model name.
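One way around that, sketched here under assumptions, is to have the harness fill in the trailer rather than asking the model to name itself. The environment variable names `AGENT_HARNESS` and `AGENT_MODEL` are hypothetical, not something any particular harness is known to set, and the helper assumes the commit message has no existing trailer block.

```python
import os

def assisted_by_trailer(harness: str, model: str) -> str:
    """Build the trailer in the 'harness:model' form described above."""
    return f"Assisted-by: {harness}:{model}"

def append_trailer(message: str, trailer: str) -> str:
    """Append a trailer after a blank line, as git expects for trailers.

    Simplification: assumes the message has no existing trailers.
    """
    return message.rstrip("\n") + "\n\n" + trailer + "\n"

# Hypothetical variables; a real harness would need to export these itself.
harness = os.environ.get("AGENT_HARNESS", "unknown-harness")
model = os.environ.get("AGENT_MODEL", "unknown-model")

commit_message = append_trailer(
    "Fix off-by-one in pagination\n",
    assisted_by_trailer(harness, model),
)
print(commit_message)
```

The point of the design is that the model name comes from configuration the harness already has, so the model's own guess about its identity never enters the commit.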

halJordan 5 hours ago

But have they? I understand that the Chinese side is illuminated and the American side is dark. I disagree that the Chinese labs have created anything that isn't in an American research lab or production dc. Sure the Chinese have published their findings and not for nothing. But are they novel? Unlikely imo

chriskanan 4 hours ago

They are doing a tremendous amount of novel research, while American AI companies have "war rooms" to study their papers and models and American labs publish next to nothing. They often have to do more with less. As an AI researcher, I think Chinese labs are doing science a tremendous service, whereas some American companies (and I'm American) seem to think only they are able to do AI research responsibly (I've been working on neural networks for 25+ years). I'm pretty sure Fable sabotaged my research codebase (see the news stories about this).

david_shi 1 hour ago

Whoa, say more about Fable sabotaging your codebase?

dofm 6 hours ago

> We are at the mercy of frontier labs for access to SOTA LLMs

I disagree with this use of SOTA, and this topic is why.

Anthropic and OpenAI have “cutting-edge” models. These are beyond the state of the art but they are closed, secretive, hard to quantify.

The “state of the art” is open source, open weights models that can be inspected, studied, shared and critiqued, because that is what is meant by “the art”: it is the knowledge and principles and evidence and materials available to all. The “state of the art” is the highest point of that.

I wish we could make this distinction and stop blessing two secretive, unverifiable loss-making companies with so much power.

(Putting that aside, I suspect, without evidence, mind you, that the endless march to solving models by making them bigger is not the solution anyway.)

MangoCoffee 3 hours ago

SOTA LLMs are less important than cheap tokens, and Chinese AI labs are releasing models that are only about 6-8 months behind American AI labs.

Chinese models like GLM are getting better at coding tasks, and they're cheaper. Microsoft's GitHub Copilot has had to switch to token-based billing. The cost of AI has increased since agents came into play. Whoever can offer cheaper tokens to do the task will win.

Even Microsoft is looking into DeepSeek for cheap tokens.

https://www.axios.com/2026/06/16/microsoft-copilot-cowork-to...

sockaddr 6 hours ago

Sorry, but I think your requirement that something only be “the art” if any arbitrary person can critique it is off. The frontier labs are working on the state of the art, but it’s just art that you aren’t allowed to see. Unfortunately.

dofm 5 hours ago

It is work using the principles of the art, obviously.

But "state of the art" implies the highest state of general availability, not just in terms of access to some product, but of use of the ideas, concepts, methodologies etc.

Anthropic and OpenAI have "cutting edge" models; the state of the art is behind the cutting edge.

The state of the art is the best open source, open weights model available. More or less by definition.

I am probably tilting at windmills here.

bnj 4 hours ago

I appreciate this distinction. There are multiple senses of SOTA, and one that has been taking on greater mindshare is as a synonym of “the best available”. By rebasing on SOTA as generally available and understood versus cutting edge, which has limited distribution and leads the way, we expand the vocabulary we have available to describe what’s going on. Thanks.

toss1 3 hours ago

That's an interesting and possibly useful distinction, but it seems unique to you. Spreading it as "We should categorize the AIs this way" would be a good argument.

But the way SOTA is generally understood by other users of the language, it refers to exactly the team, technology, & techniques defining the cutting edge in any field, regardless of whether the technology & techniques are available outside of that team...

8note 5 hours ago

the art is the standard engineering practices that go into building the thing

it's things you would be trained in as part of a bachelor's degree and some graduate coursework