by oofbey 167 days ago
I think that was pretty clear even when this paper came out - even if you could find these subnetworks, they wouldn’t be faster on real hardware. Never thought much of this paper, but it sure did get a lot of people excited.
3 comments

It was exciting because of what it means regarding how a model learns, regardless of whether or not it’s commercially applicable.
(Cerebras is real hardware.)
It is real in that it exists. It is not real in the sense that almost nobody has access to it. Unless you work at one of the handful of organizations with their hardware, it’s not a practical reality.
how long will that be the case?
They have a strange business model. Their chips are massive, so they necessarily only sell them to large customers. Also, because of the way they’re built (the entire wafer is a single chip), no two chips will be the same. Normally, imperfections in manufacturing result in some parts of the wafer being rejected and others binned as fast or slow chips. If you use the whole wafer, you get what you get. So it’s necessarily a strange platform to work with - every device is slightly different.
At least for the foreseeable future (next 50 years say).
i saw how it nerdsniped an extremely capable faculty member