by strin
2212 days ago
> “GPT-3” is just a bigger GPT-2. In other words, it’s a straightforward generalization of the “just make the transformers bigger” approach

Yes, it’s true. But there is a difference between what’s interesting and what works. Deep learning (RNNs, transformers, etc.) is usually old ideas applied at large scale with slight modifications. Proving that a model works well at large scale (175B parameters) is a great contribution, and it measures our progress towards AI.