Hacker News
by nwlieb 1162 days ago
The runtime is quadratic in the context size, although there seems to be some progress on this front:
https://gwern.net/note/attention
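
To make the quadratic part concrete, here is a rough sketch. It assumes the comment is about standard dense self-attention, where every token attends to every other token; the function names and the head dimension of 64 are made up for illustration, and the counts ignore constant factors, softmax, and projections.

```python
def attention_score_count(context_len: int) -> int:
    """Number of query-key scores in dense self-attention (hypothetical helper)."""
    # Each of the n tokens is scored against all n tokens: an n x n matrix.
    return context_len * context_len


def attention_multiply_adds(context_len: int, head_dim: int) -> int:
    """Rough multiply-add count for one attention head (hypothetical helper)."""
    # Q @ K^T costs n * n * d multiply-adds; weighting V costs another n * n * d.
    return 2 * context_len * context_len * head_dim


if __name__ == "__main__":
    # head_dim=64 is an assumed value, chosen only for illustration.
    for n in (1024, 2048, 4096):
        print(n, attention_score_count(n), attention_multiply_adds(n, 64))
```

Doubling the context length quadruples both counts, which is why long contexts get expensive fast. The approaches collected at the link above try to avoid building the full n x n score matrix.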