Hacker News
by sottol, 1153 days ago
Classic attention is quadratic in context length, and faster alternatives don't seem to perform as well. I wonder how Hyena compares to linear attention algorithms.
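
For anyone unsure where the quadratic cost comes from, here is a rough pure-Python sketch, not taken from any particular paper's code. Softmax attention has to build an n x n score matrix, while one common flavor of linear attention swaps the softmax for a positive feature map (I'm assuming `elu(x) + 1` here) so the product can be reordered and only a d x d summary of the keys and values is ever built. All function names are made up for illustration, and this says nothing about how Hyena itself works.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def softmax_attention(q, k, v):
    """Classic attention: materializes an n x n weight matrix."""
    scale = math.sqrt(len(q[0]))
    scores = matmul(q, transpose(k))  # n x n, the quadratic part
    weights = []
    for row in scores:
        m = max(row)  # subtract the row max for numerical stability
        exps = [math.exp((s - m) / scale) for s in row]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return matmul(weights, v)

def phi(x):
    """Assumed feature map elu(x) + 1, which is always positive."""
    return x + 1.0 if x > 0 else math.exp(x)

def linear_attention(q, k, v):
    """Non-causal linear attention: computes phi(K)^T V first (d x d_v)."""
    fq = [[phi(x) for x in row] for row in q]
    fk = [[phi(x) for x in row] for row in k]
    kv = matmul(transpose(fk), v)           # d x d_v, independent of n in size
    k_sum = [sum(col) for col in zip(*fk)]  # length d, used for normalization
    out = []
    for row in fq:
        norm = sum(a * b for a, b in zip(row, k_sum))
        out.append([x / norm for x in matmul([row], kv)[0]])
    return out

def linear_attention_quadratic(q, k, v):
    """Same kernel attention, computed the slow way with an n x n matrix."""
    fq = [[phi(x) for x in row] for row in q]
    fk = [[phi(x) for x in row] for row in k]
    scores = matmul(fq, transpose(fk))  # n x n
    weights = [[s / sum(row) for s in row] for row in scores]
    return matmul(weights, v)
```

The last two functions give the same result up to floating-point error, which is the whole trick: the reordering is exact for the kernel version. Neither matches `softmax_attention`, though, and that gap is where the quality difference comes from.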