| HN Mirror


	by admax88qqq 2 hours ago
	> One day, maybe not far from now, a breakthrough will allow huge LLMs (say 200B in size) to run well on a 5-year-old Dell desktop. But if you have such a breakthrough, could you not also apply it and run 200T models in today's datacenters?

3 comments

ACCount37 1 hour ago

Not only could you: you would also want to.

The likes of Mythos show that the scaling laws are real, and you can x5/x2 the total/active params and get meaningful gains. If "inference per param" gets cheaper? Up the params and get more intelligence for the same price.

pennomi 2 hours ago

That assumes scaling laws still hold up. A bigger model might end up only incrementally more intelligent.

ACCount37 58 minutes ago

They do. Mythos kicked ass while it lasted. And what we know of the scaling law curves promises us even more gains in the future.

"The future" being "whenever training and inference at increased scale become economical". That is probably bounded by new generations of hardware, but might also be pushed forward by algorithmic advances.

phkahler 36 minutes ago

I think they're out of training data though...

ACCount37 32 minutes ago

Synthetics are often used for "data amplification" nowadays. Extra compute covers a multitude of sins.

deweywsu 2 hours ago

Quite true
