by fpgaminer 1157 days ago
I don't recall the Chinchilla paper disputing my point. They establish "training-compute optimal" scaling laws, but none of their findings suggest that loss hits any kind of asymptote.

Perhaps we're talking past each other. Is "loss threshold" a specific term in the LLM literature?

Merely pointing out that the debate over whether we are compute-limited or data-limited (OP) has not concluded at all; there are lots of compelling theories on the relationship between the two.