by imjonse
618 days ago
To their credit, the authors (Y. Bengio among them) end the paper with the question, not suggesting they know the answer. These models are very small even by academic standards, so any finding would not necessarily extend to current LLM scales. The main conclusion is that RNN class networks can be trained as efficiently as modern alternatives but the resulting performance is only competitive at small scale.
|
Emphasis on "not necessarily".
>> The main conclusion is that RNN class networks can be trained as efficiently as modern alternatives but the resulting performance is only competitive at small scale.
Shouldn't the conclusion be "the resulting competitive performance has only been confirmed at small scale"?