Power Attention: Efficient CUDA Kernels for Symmetric Power Transformers

At Manifest AI we have just released our open-source CUDA kernels to implement Symmetric Power Transformers, as described in our paper from back in August:

https://manifestai.com/articles/symmetric-power-transformers...

Since this is a variant of linear attention, training cost is linear in sequence length (as opposed to quadratic for regular attention), and inference cost per generated token is constant, because the model carries a fixed-size state instead of a growing KV cache. This is especially attractive for longer contexts!
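
To make the linear-vs-quadratic point concrete, here is a minimal pure-Python sketch, not the released kernels or their API. It assumes a degree-2 power similarity, (q . k)^2, with sum normalization; the degree, the function names (phi, quadratic_attention, recurrent_attention), and the toy shapes are all my own illustrative choices. It uses the full tensor power as the feature map (a symmetric version would deduplicate repeated entries) and checks that the recurrent form with a fixed-size state gives the same outputs as the quadratic form.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def phi(x):
    # Degree-2 tensor power feature map (illustrative assumption):
    # dot(phi(q), phi(k)) == dot(q, k) ** 2.
    return [a * b for a in x for b in x]

def quadratic_attention(Q, K, V):
    # Reference form: causal attention with weights (q_i . k_j)^2.
    # Each step looks at all previous tokens, so total work is O(t^2).
    out = []
    for i, q in enumerate(Q):
        w = [dot(q, K[j]) ** 2 for j in range(i + 1)]
        total = sum(w)
        out.append([sum(w[j] * V[j][c] for j in range(i + 1)) / total
                    for c in range(len(V[0]))])
    return out

def recurrent_attention(Q, K, V):
    # Same outputs from a fixed-size state: S has shape (D, dv) and z has
    # shape (D,), with D = d * d. Work per token does not depend on position.
    D, dv = len(Q[0]) ** 2, len(V[0])
    S = [[0.0] * dv for _ in range(D)]
    z = [0.0] * D
    out = []
    for q, k, v in zip(Q, K, V):
        fk = phi(k)
        for a in range(D):
            z[a] += fk[a]
            for c in range(dv):
                S[a][c] += fk[a] * v[c]
        fq = phi(q)
        denom = dot(fq, z)
        out.append([sum(fq[a] * S[a][c] for a in range(D)) / denom
                    for c in range(dv)])
    return out

# Toy inputs: 6 tokens, key/query dim 3, value dim 2.
random.seed(0)
t, d, dv = 6, 3, 2
Q = [[random.gauss(0, 1) for _ in range(d)] for _ in range(t)]
K = [[random.gauss(0, 1) for _ in range(d)] for _ in range(t)]
V = [[random.gauss(0, 1) for _ in range(dv)] for _ in range(t)]

ref = quadratic_attention(Q, K, V)
rec = recurrent_attention(Q, K, V)
max_diff = max(abs(a - b) for ra, rb in zip(ref, rec) for a, b in zip(ra, rb))
print("max abs difference:", max_diff)
```

The two results should agree up to floating-point error. The state in the recurrent form stays at d*d*dv + d*d numbers no matter how many tokens have been seen, which is where the constant per-token inference cost comes from.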

Have a look and play with it -- and of course contributions are very welcome! It's an early alpha!