Hacker News new | ask | show | jobs
by jacobgorm 645 days ago
I was part of a startup called Grazper that did the same thing for CNNs in 2016, using FPGAs. I left to found my own thing after realizing that new better architectures, SqueezeNet followed by MobileNets, could run even faster than our ternary nets on off-the-shelf hardware. I’d worry that a similar development might happen in the LLMs space.
1 comments

It's always possible, but transformers have been around since 2017 and don't seem to be going anywhere. I was bullish on Mamba and researched extended context for structured state-space models at Dartmouth. However, no one cared. The bet we're taking is that Transformers will dominate for at least a few more years, but our bet could be wrong.