by regularfry 1211 days ago
It's interesting because of the scaling law. No matter how much acceleration matrix multiplication gets on an electronic circuit, its energy usage is always going to scale as O(n^2.something). The implication here is that the energy usage of doing it optically is O(1). At least, that's how I read "We found that the optical energy per multiply-accumulate (MAC) scales as 1/d where d is the Transformer width". The best electronics can hope for is to stay on the right side of the constant factors (which, currently, the GPU world is).
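To make the constant-factor point concrete, here's a rough sketch of the per-MAC comparison. It assumes the electronic energy per MAC doesn't depend on d and takes the quoted 1/d scaling at face value for the optical side. Both constants and both function names are made up for illustration and aren't from the paper.

```python
def optical_energy_per_mac(d, e0):
    """Assumed optical energy per MAC: e0 / d, per the quoted 1/d scaling."""
    return e0 / d


def crossover_width(e0, e_electronic):
    """Width d at which the assumed optical per-MAC energy equals the electronic one."""
    return e0 / e_electronic


# Made-up constants in arbitrary energy units, NOT measured values.
E_ELECTRONIC = 1.0   # assumed flat electronic cost per MAC
E0_OPTICAL = 1000.0  # assumed optical prefactor

if __name__ == "__main__":
    for d in (64, 512, 4096):
        optical = optical_energy_per_mac(d, E0_OPTICAL)
        winner = "optical" if optical < E_ELECTRONIC else "electronic"
        print(f"d={d}: optical={optical:.3f}, electronic={E_ELECTRONIC:.3f} -> {winner}")
    print("crossover at d =", crossover_width(E0_OPTICAL, E_ELECTRONIC))
```

With these invented numbers the electronic side stays ahead until d passes e0 / e_electronic. That's the sense in which the prefactor decides who wins at any given width, even if the scaling favours optics.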