by int_19h
1213 days ago
The biggest problem is that the cost of scaling is non-linear. The returns might well be non-diminishing with respect to model size, but if we have to throw N^2 hardware at it to make it (best case) 2N better, we'll still hit the limit pretty quickly.
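To put rough numbers on that, here is a minimal sketch that takes the N^2 and 2N figures literally. They are illustrative, not a measured scaling law, and the function names are made up for this example. Under those assumptions the improvement you can buy grows only with the square root of the hardware budget:

```python
import math


def hardware_needed(n: float) -> float:
    """Hypothetical cost model: N^2 units of hardware for scale N."""
    return n ** 2


def improvement(n: float) -> float:
    """Hypothetical best-case payoff: 2N 'better' at scale N."""
    return 2 * n


def best_improvement_for_budget(budget: float) -> float:
    """Largest improvement reachable with `budget` units of hardware.

    Inverts the assumed cost model (N = sqrt(budget)) and then applies
    the assumed payoff (2N).
    """
    n = math.sqrt(budget)
    return improvement(n)


# Each 100x increase in hardware buys only a 10x increase in improvement.
for budget in (1, 100, 10_000, 1_000_000):
    print(budget, best_improvement_for_budget(budget))
```

Put differently, doubling the improvement means quadrupling the hardware every time, so any fixed hardware ceiling gets reached quickly.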