| HN Mirror



	by fpgaminer 1157 days ago
	I don't recall the Chinchilla paper disputing my point. They establish "training-compute optimal" scaling laws, but none of their findings suggest that loss hits any kind of asymptote.

1 comment

midland_trucker 1157 days ago

Perhaps we're talking past each other. Is "loss threshold" a specific term in the LLM literature?

I'm merely pointing out that the debate over whether we are compute-limited or data-limited (OP) has not concluded at all; there are lots of compelling theories on the relationship between the two.
