by famouswaffles
618 days ago
It's necessary for arbitrary information processing if you can forget and have no way to "unforget". A model can decide to forget something that later turns out to be important for some future prediction. A human can go back and re-read, re-listen, etc. A transformer is always re-reading, but an RNN can't, and is fucked.
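To make that concrete, here's a toy sketch in Python. It is not a real RNN or transformer. The function names, the `capacity` parameter, and the key/value recall task are all made up for illustration. One reader makes a single pass with a bounded state and has to evict things as it goes; the other re-reads the full history when the query arrives.

```python
from collections import OrderedDict

def recurrent_recall(stream, query, capacity=2):
    """Single pass with a fixed-size state (hypothetical stand-in for an RNN).

    Once a pair is evicted there is no way to get it back.
    """
    state = OrderedDict()
    for key, value in stream:
        state[key] = value
        state.move_to_end(key)
        if len(state) > capacity:
            state.popitem(last=False)  # "decide to forget" the oldest entry
    return state.get(query)

def full_context_recall(stream, query):
    """Re-reads the whole history at query time (stand-in for attention)."""
    for key, value in reversed(stream):
        if key == query:
            return value
    return None

stream = [("a", 1), ("b", 2), ("c", 3), ("d", 4)]
print(recurrent_recall(stream, "a"))     # None: "a" was evicted before the query
print(full_context_recall(stream, "a"))  # 1: the full history is still there
```

The eviction rule here is just "drop the oldest", which is an assumption. A learned model picks what to drop, but the point is the same: whatever it drops is gone when the question finally shows up.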