| HN Mirror


by kamikazeturtles 557 days ago

> I think because they are trained on Claude/O1, they tend to have comparable performance.

Why does having comparable performance indicate having been trained on a preexisting model's output?

I read a similar claim in relation to another model in the past, so I'm just curious how this works technically.

1 comment

wordpad25 557 days ago

Because the Valley is burning money and GPUs training these models. When somebody else comes out with another model for a tiny fraction of the cost, it's easy to assume it was trained on synthetic data.
