Hacker News
by andy99 5 hours ago
> Chinese labs must entirely retool from harvesting frontier model data to producing the data systems and efforts to produce novel data

Even if your characterization is accurate, they could do this tomorrow and are not so myopic that they wouldn’t have thought about it. I don’t see this as a barrier, and I see a lot of the same underestimation of Asia that’s been happening for 50 years. There’s not some innate American advantage to building LLMs, and personally I think whatever head start the US has is going to be squandered on delays from the export control “too dangerous for release” LARPing we’re seeing.

3 comments

I am not sure which part you are interpreting as underestimation or whatever? Quite the opposite: I claim the difference arises from a difference in strategies, not from intrinsic differences in ability.

Also, I was responding to a claim about what will happen in less than 6 months (that’s about the edge of what you can meaningfully say much about in this field).

These strategies take materially different resources; it’s not an overnight decision made by leadership. I suppose there is a natural experiment ongoing at Meta regarding this: it seems they recently moved a number of people into a division to produce such data overnight. So we will find out soon how quickly they climb the leaderboards.

Exactly. If they wanted to, they could produce the same amount of data. Companies like Scale, Mercor, and Surge exist for a reason, a reason that doesn't need to exist in China if they mandate that Chinese enterprises provide all their real-world data (or have them work inside RL environments) to the model companies for post-training. There is no real advantage that US companies have except a head start, and as Jensen said, a ton of the research advantage is skewed since a lot of the best researchers in the US are Chinese nationals. I do think the model is just one piece of the pie (not to echo Jensen too much), and hopefully we will always be able to serve these bigger frontier models in a much more efficient way, as well as build out the application layer faster, which is what actually makes them useful and/or more dangerous/powerful.
Why would those have any impact on R&D speed? Most are funded and close to cash flow positive.