by patrickas
2101 days ago
As far as I understand, in this specific case, yes. The whole schtick of GPT-3 is the insight that we do not need to come up with a better algorithm than GPT-2.
If we dramatically increase the number of parameters without changing the architecture/algorithm, its capabilities will actually increase dramatically instead of reaching a plateau, as some expected. Edit:
Source: https://www.gwern.net/newsletter/2020/05#gpt-3 "To the surprise of most (including myself), this vast increase in size did not run into diminishing or negative returns, as many expected, but the benefits of scale continued to happen as forecasted by OpenAI."