by pizza 613 days ago
I was just going to mention that it seems like it should be possible to make a Flash Attention version of this algorithm, and was pleasantly surprised to see that they've already included an implementation of one :)