by phkahler
618 days ago
>> A user of an LLM might give the model some long text and then say "Translate this into German please". A Transformer can look back at its whole history. Which isn't necessary if you say "Translate the following to German" instead. Then all it needs is to remember the task at hand and a much smaller amount of recent input. Well, and the ability to output in parallel with processing input.
An RNN can decide to forget something that turns out to be important for some future prediction. A human can go back and re-read, re-listen, etc. A transformer is always re-reading, but an RNN can't, and is fucked.
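To make that concrete, here's a toy sketch in Python. It is only an analogy, not a real RNN or transformer: the function names and the `capacity` parameter are made up for illustration, and a real RNN compresses its input into a fixed-size vector rather than dropping the oldest items the way this bounded queue does. The point is just that a fixed-size state has to discard something, while a full history never does.

```python
from collections import deque


def recall_with_full_history(tokens, query):
    """Transformer-like stand-in: every past token stays visible."""
    history = []
    for tok in tokens:
        history.append(tok)  # nothing is ever discarded
    return query in history


def recall_with_fixed_state(tokens, query, capacity):
    """RNN-like stand-in: only `capacity` items survive in the state."""
    state = deque(maxlen=capacity)  # oldest entries are silently dropped
    for tok in tokens:
        state.append(tok)
    return query in state


# One important token up front, then a long run of filler.
tokens = ["key=42"] + [f"filler{i}" for i in range(100)]

print(recall_with_full_history(tokens, "key=42"))             # True
print(recall_with_fixed_state(tokens, "key=42", capacity=8))  # False
```

If the bounded version had known in advance that `key=42` mattered, it could have held on to it, which is roughly the "give the instruction first" case from the quoted comment. Without that, it has no way to go back.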