Hacker News
by woadwarrior01 455 days ago
Dot-product attention is the biggest barrier, which is why there are so many attempts to linearize it.
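For anyone wondering what "linearize" means here, a rough sketch in plain Python. The function names and the `phi` feature map (elu(x) + 1) are my own picks for illustration, not taken from any particular paper or library. Standard attention scores every query against every key, so the cost grows with n^2 in sequence length. The kernelized version sums over the keys once and reuses that summary for every query, so the cost grows with n. Note that it computes a different function from softmax attention, not the same one faster.

```python
import math

def softmax_attention(Q, K, V):
    """Standard dot-product attention: every query is scored against every key."""
    d = len(Q[0])
    out = []
    for q in Q:
        # One score per key; this inner loop is what makes the cost O(n^2 * d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        out.append([sum(w * v[j] for w, v in zip(weights, V)) / z
                    for j in range(len(V[0]))])
    return out

def phi(x):
    """Hypothetical positive feature map, elu(x) + 1, chosen for illustration."""
    return [xi + 1.0 if xi > 0 else math.exp(xi) for xi in x]

def linear_attention(Q, K, V):
    """Kernelized attention: accumulate phi(k) v^T once, reuse it for every query."""
    d, dv = len(Q[0]), len(V[0])
    S = [[0.0] * dv for _ in range(d)]  # sum over keys of phi(k) v^T
    z = [0.0] * d                       # sum over keys of phi(k)
    for k, v in zip(K, V):
        fk = phi(k)
        for a in range(d):
            z[a] += fk[a]
            for b in range(dv):
                S[a][b] += fk[a] * v[b]
    out = []
    for q in Q:
        fq = phi(q)
        norm = sum(fq[a] * z[a] for a in range(d))  # positive because phi > 0
        out.append([sum(fq[a] * S[a][b] for a in range(d)) / norm
                    for b in range(dv)])
    return out
```

Both functions return one weighted average of the rows of `V` per query. Only the weights differ, and that difference is where the quality gap comes from.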
1 comment

That fail... Linearization is a bad idea, but plenty of other optimizations are done.