by anon291
808 days ago
> I view this recursion as more of a strength than weakness

Sure, it's a strength given that transformers are currently limited by compute budget, but theoretically, if we were to have a way to overcome this, it seems obvious to me that transformers' 'one-shot' ability makes them better. That being said, the recursive aspect you're referencing can be built into a transformer as well. This is a sampling and training problem.