by jeeceebees
2212 days ago
I think the larger models get, the more incentive there is for researchers to look into pruning or distilling them for practical use. GPT-1, GPT-2, GPT-3, and others have all shown that larger is better. In the short term this means people will simply throw larger and larger clusters at the problem, but in the longer term there needs to be innovation in making models more efficient on the clusters we have (even the cloud has limits). I think sheer parameter count is an important part of the equation for general intelligence, so it's important to have labs that scale promising leads up to trillions of parameters alongside labs that explore new directions.