|
|
|
|
|
by ewheeler
1705 days ago
|
|
Maybe? The Scaling Hypothesis[1] suggests that greater capabilities of intelligence may emerge from scaling up 'scalable architectures' to large sizes. GPT-3 exhibits 'meta-learning' capabilities that GPT-2 did not (like learning how to sum numbers)--probably just because its a 100x larger version of GPT-2. [1] https://www.gwern.net/Scaling-hypothesis |
|