by dataangel 1056 days ago
Are those good or bad?

FlashAttention is an amazing improvement over the previous state of the art. The others are still highly experimental, but they seem likely to at least contribute significant knowledge to whatever ends up surpassing the Transformer (assuming something does).