| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by ewheeler 1752 days ago
	Maybe? The Scaling Hypothesis[1] suggests that greater capabilities of intelligence may emerge from scaling up 'scalable architectures' to large sizes. GPT-3 exhibits 'meta-learning' capabilities that GPT-2 did not (like learning how to sum numbers)--probably just because its a 100x larger version of GPT-2. [1] https://www.gwern.net/Scaling-hypothesis