by barbariangrunge 1153 days ago
> At 64,000 tokens, the authors relate, "Hyena speed-ups reach 100x" -- a one-hundred-fold performance improvement.

That’s quite the difference

1 comment

Classic attention is quadratic in context length, and the faster alternatives don't seem to perform as well. I wonder how Hyena compares to linear attention algorithms.
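To put rough numbers on "quadratic in context length", here is a back-of-the-envelope sketch that counts multiply-adds for one head of standard softmax attention versus a generic kernelized linear attention. It is not Hyena's algorithm and does not reproduce the 100x figure quoted above. The function names and the head dimension of 64 are assumptions picked for illustration.

```python
def softmax_attention_madds(n: int, d: int) -> int:
    """Multiply-adds for one head of standard attention.

    Q @ K^T builds an n x n score matrix (n * n * d), and applying
    those scores to V costs another n * n * d.
    """
    return 2 * n * n * d


def linear_attention_madds(n: int, d: int) -> int:
    """Multiply-adds for one head of (non-causal) linear attention.

    K^T @ V is computed first as a d x d matrix (n * d * d), then
    Q is multiplied by it (n * d * d), so no n x n matrix is formed.
    """
    return 2 * n * d * d


if __name__ == "__main__":
    d = 64  # assumed head dimension, illustration only
    for n in (1_000, 8_000, 64_000):
        quad = softmax_attention_madds(n, d)
        lin = linear_attention_madds(n, d)
        print(f"n={n:>6}: quadratic={quad:.2e}  linear={lin:.2e}  ratio={quad / lin:.0f}x")
```

These are operation counts, not wall-clock measurements, so they only show how the gap widens with context length: doubling n quadruples the quadratic cost but only doubles the linear one. They say nothing about model quality, which is the other half of the question.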