by imtringued
782 days ago
Something that seems to be little known is that the transformer architecture needs to become more compute bound. Inventing a machine learning architecture that is FLOPs heavy rather than memory-bandwidth heavy would be a good start. It could be as simple as using a CNN in place of the V matrix. Yes, this makes the architecture less efficient, but it also makes it easier for an accelerator to speed up, since CNNs tend to be compute bound.
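To make "compute bound" versus "bandwidth bound" concrete, here is a rough back-of-the-envelope sketch of arithmetic intensity (FLOPs per byte moved). It is not a model of any real accelerator. The function names, the fp16 element size, and the shapes are all assumptions picked for illustration, and the byte counts assume each tensor is read or written exactly once.

```python
def matvec_intensity(d_model: int, bytes_per_el: int = 2) -> float:
    """FLOPs per byte for y = W @ x with a (d_model x d_model) weight matrix.

    Assumed to stand in for applying a projection such as V to a single
    token, as in batch-1 autoregressive decoding.
    """
    flops = 2 * d_model * d_model  # one multiply and one add per weight
    # Bytes moved: the weight matrix, plus the input and output vectors.
    bytes_moved = bytes_per_el * (d_model * d_model + 2 * d_model)
    return flops / bytes_moved


def conv1d_intensity(seq_len: int, channels: int, kernel: int,
                     bytes_per_el: int = 2) -> float:
    """FLOPs per byte for a 1D convolution over a whole sequence.

    Assumes equal input and output channel counts and "same" padding,
    so the output has seq_len positions.
    """
    flops = 2 * kernel * channels * channels * seq_len
    # Bytes moved: the kernel weights, plus input and output activations.
    bytes_moved = bytes_per_el * (
        kernel * channels * channels + 2 * seq_len * channels
    )
    return flops / bytes_moved


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to show the order of magnitude.
    print(matvec_intensity(4096))           # just under 1 FLOP per byte
    print(conv1d_intensity(2048, 4096, 3))  # 1536.0 FLOPs per byte
```

With these assumed shapes, the single-token matrix-vector product does about one FLOP per byte, so it is limited by how fast the weights can be streamed from memory. The convolution reuses each weight at every sequence position and lands three orders of magnitude higher. Most of that gap comes from weight reuse: the same matrix applied to a whole batch of tokens would also score high, so the estimate illustrates the direction of the argument rather than settling it.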