


	by ydau 2198 days ago
	GPT-3 was another bucket's worth of evidence in favor of the scaling hypothesis. Performance kept improving (and the cost to train kept increasing) as more parameters were added. Even at 175 billion parameters, performance had not yet plateaued. One takeaway is that throwing a lot of compute at the problem helps tremendously :). You can read more about GPT-3 here: https://lambdalabs.com/blog/gpt-3/

1 comment

Are you alluding to "The Bitter Lesson" [1] by Rich Sutton [2]?