by NitpickLawyer 139 days ago

> We ran out of fresh interesting data.

No, we didn't. Hassabis has been saying this for a while now, and Gemini 3 is proof of that. The data is there; there are still plenty of untapped resources.

> Synthetic data training became a huge thing over the last year.

No, people "heard" about it over the last year. Synthetic data has been a thing in model training for ~2 years already. Llama 3 was post-trained on synthetic-only data, and was released in April 2024. Research-only work came even earlier, with the Phi family of models. Again, if you only read the mainstream media, you won't get as accurate a picture of these things as you'd get from actually working in this field, or even from following good sources, reading the key papers, and so on.

> The fact we worked around those problems doesn't mean they weren't real.

The way the media (and some influencers in this space) have framed it over the last year is not accurate. I get that people don't trust CEOs (and for good reasons), but even Amodei was saying there is no data problem in interviews early in 2025.