by kristjansson
130 days ago
> self-attention is efficiently computable to arbitrary precision with constant cost per token

This paper at least aspires to reproduce 'true' attention, which distinguishes it from many of the others. TBD if it's successful in that.