by Mehvix 237 days ago
>None of that is specialized to run only transformers at this point

Isn't this what [Etched](https://www.etched.com/) is doing?

1 comment

Hardware that can only run transformers is a silly concept: attention is essentially two matrix multiplications (with a softmax in between), and matrix multiplication is the standard operation in feed-forward and convolutional layers. Basically, anything that runs those layers well gets transformers for free.
The devil is in the details.
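
To make the "two matrix multiplications" point concrete, here is a rough sketch of single-head scaled dot-product attention in plain Python. The helper names (`matmul`, `softmax_rows`, `attention`) are made up for illustration and don't come from any library, and this ignores masking, batching, and multiple heads. The only step that isn't a matmul is the row-wise scale and softmax in the middle.

```python
import math

def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def softmax_rows(m):
    """Numerically stable softmax applied to each row independently."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def attention(q, k, v):
    """Single-head attention: softmax(Q K^T / sqrt(d)) V (illustrative sketch)."""
    d = len(q[0])
    scores = matmul(q, transpose(k))                      # matmul 1: Q K^T
    scaled = [[s / math.sqrt(d) for s in row] for row in scores]
    weights = softmax_rows(scaled)                        # the non-matmul step
    return matmul(weights, v)                             # matmul 2: weights V

# With all-zero keys every score ties, so each output row is the mean of V's rows.
q = [[1.0, 0.0], [0.0, 1.0]]
k = [[0.0, 0.0], [0.0, 0.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
print(attention(q, k, v))  # [[2.0, 3.0], [2.0, 3.0]]
```

So the heavy lifting is the same operation a feed-forward layer uses, and the softmax in between is one of the details in question.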