| HN Mirror



	by arnaudsm 1609 days ago
	It's linear for now (check GPT-2 vs GPT-3), but we're close to the point of diminishing returns.

2 comments

bcaine 1609 days ago

It's actually not linear; it's a power law. That means we need exponentially more compute, data, and model parameters to see linear improvements in performance.
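
To make that concrete, here is a rough sketch assuming loss follows loss = a * compute^(-alpha). The constants `a` and `alpha` and the function names are made up for illustration, not fitted to any real model. Under that assumption, every further fixed-percentage cut in loss costs the same multiplicative jump in compute, so compute grows geometrically with the number of improvement steps.

```python
# Hypothetical power law: loss = a * compute ** -alpha.
# a and alpha are illustrative values, not measured from any real model.

def loss(compute, a=1.0, alpha=0.05):
    """Loss predicted by the assumed power law for a given compute budget."""
    return a * compute ** -alpha


def compute_for_loss(target, a=1.0, alpha=0.05):
    """Invert the power law: compute needed to reach a target loss."""
    return (a / target) ** (1.0 / alpha)


def compute_multiplier(loss_ratio, alpha=0.05):
    """Factor by which compute must grow to scale loss by loss_ratio (0.5 = halve it)."""
    return loss_ratio ** (-1.0 / alpha)


if __name__ == "__main__":
    # Each row cuts loss by another 10%; compute grows by the same factor every row.
    for step in range(5):
        target = 0.9 ** step
        print(f"loss {target:.3f} needs compute {compute_for_loss(target):.3e}")
    print(f"halving loss multiplies compute by {compute_multiplier(0.5):.3e}")
```

With the assumed alpha of 0.05, halving the loss takes about 2^20 (roughly a million) times more compute. A different exponent changes the number, not the shape of the curve.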


mindcrime 1609 days ago

Part of the problem, though, is that we don't know for sure what non-linearities may be lurking out there. Maybe we add 100 more "neurons" to the net and it "goes exponential," so to speak. Or maybe not. There's still a lot we don't know about the emergent properties of these systems as they scale up.
