by txus
509 days ago
At Manifest AI we have just released our open-source CUDA kernels implementing Symmetric Power Transformers, as described in our paper from back in August: https://manifestai.com/articles/symmetric-power-transformers... Since this is a variant of linear attention, training cost is linear in sequence length (as opposed to quadratic for regular attention), and inference cost per token is constant. This is especially attractive for longer contexts! Have a look and play with it, and of course contributions are very welcome! It's an early alpha!
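To make the cost claim concrete, here is a rough NumPy sketch of the general idea, not the released CUDA kernels or their API. It assumes a degree-2 power kernel, i.e. attention weights proportional to (q . k)^2, with a simple sum normalization; the function names (`phi`, `quadratic_attention`, `recurrent_attention`) are made up for illustration, and the full (uncompressed) tensor power is used as the feature map for simplicity. The recurrent form keeps a fixed-size state, so the work per token does not grow with context length, while the quadratic form recomputes all pairwise scores.

```python
import numpy as np

def phi(x):
    # Illustrative degree-2 feature map (full tensor power of x with itself).
    # It satisfies phi(q) . phi(k) == (q . k) ** 2.
    return np.outer(x, x).ravel()

def quadratic_attention(Q, K, V):
    # Reference causal attention with weights (q_i . k_j)^2: O(T^2) in sequence length.
    scores = np.tril((Q @ K.T) ** 2)          # mask out future positions
    norm = scores.sum(axis=1, keepdims=True)  # assumed sum normalization
    return (scores @ V) / norm

def recurrent_attention(Q, K, V):
    # Same computation as a recurrence over a fixed-size state (S, z).
    T, d = Q.shape
    S = np.zeros((d * d, V.shape[1]))  # running sum of phi(k_j) v_j^T
    z = np.zeros(d * d)                # running sum of phi(k_j)
    out = np.empty((T, V.shape[1]))
    for t in range(T):
        fk = phi(K[t])
        S += np.outer(fk, V[t])
        z += fk
        fq = phi(Q[t])
        out[t] = (fq @ S) / (fq @ z)   # cost depends on d, not on t
    return out

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 6, 4))   # three arrays of shape (T=6, d=4)
same = np.allclose(quadratic_attention(Q, K, V), recurrent_attention(Q, K, V))
```

The state here has d*d rows, which is the price paid for constant per-token cost; the two functions should agree up to floating-point error on the same inputs.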