Hacker News
by numeri 1056 days ago
FlashAttention is an amazing improvement over the previous state of the art. The others are still highly experimental, but seem like they'll at least contribute significant knowledge to whatever ends up surpassing the Transformer (assuming something does).