| HN Mirror


	by sgt101 335 days ago
	Andrej "train it on more data and the problem will go away" Karpathy

1 comments

Grimblewald 335 days ago

Right? I'm not arguing against the skills he obviously has, but if we're always just piecewise approximating the underlying manifold, then there will always be new problems. The amount of data required to reliably approximate reality, in the absence of an inductive bias, is infeasible to collect. Not to mention how computationally inefficient it becomes as your model blows out in size/complexity.
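
To put rough numbers on the data-requirement point, here is a minimal sketch. It assumes a naive piecewise-constant approximation that needs at least one sample per grid cell; the function name `cells_needed` and the choice of 10 bins per axis are illustrative, not taken from any real system.

```python
def cells_needed(bins_per_axis: int, dims: int) -> int:
    """Grid cells (and so minimum samples) needed to cover a
    dims-dimensional input space at a fixed per-axis resolution,
    with no inductive bias to share information between cells."""
    return bins_per_axis ** dims


if __name__ == "__main__":
    # Hypothetical resolution of 10 bins per axis.
    for d in (1, 2, 10, 100):
        print(f"{d} dims: {cells_needed(10, d)} cells")
```

The count grows exponentially with the number of input dimensions, which is why a purely data-driven fit tends to need some structural assumption to stay tractable.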

sgt101 334 days ago

I guess the gamble was that there would be a certain point where the edge cases disappeared into the noise and it didn't matter that the approximation was/is "wrong" because the behavior of the car would match the requirements of the situation even if it wasn't for the right reasons.

To be fair, I remember reading about GPT-2 and thinking that LLMs would blow out for similar reasons.

adastra22 333 days ago

Which they more or less have. Larger models are seeing negligible returns. It just turned out that scaling would hold out just enough longer to make LLMs generally useful.

Grimblewald 331 days ago

Yup, and if you normalise "improvements vs time" graphs not to linear time but to GPU hours invested per unit of improvement, we're in extremely incremental/small improvement territory as of a year ago. There are no major jumps coming. There are no more GPU hours to dump onto this particular bonfire to keep things looking like exponential improvement, all to keep that VC cash flowing.
