by varelse
2361 days ago
I am far more excited by the underlying Wafer Scale Integration moonshot than I am by any AI benchmarks here. I know it's trendy to think there can only be one w/r/t the AI Iron Throne, but nope, not the case: everyone is writing bespoke code in production, where the money is made. Well, almost everyone; Amazon seems to be the odd duck, but they're a bunch of cheapskate thought leaders anyway (except for their offers to junior engineers in their desperate Hail Mary attempt to catch up with FAIR and DeepMind, but... I... digress...). Which is to say that graphs written to run specifically on Cerebras's giant chip will smash deep learning's speed barrier for graphs written to run best on Cerebras's giant chip. And that's great, but it won't be every graph; there is no free lunch. Hear me now, believe me later(tm). But if we can cut the cost of interconnect by putting a figurative datacenter's worth of processors on a chip, that's genuinely interesting, and it has applications far beyond the multiplies and adds of AI. But be very wary of anyone wielding the term "sparse", for it is a massively overloaded term, and every single one of its definitions is a beautiful and unique snowflake w/r/t efficient execution on bespoke HW.
Also, is this something that will likely scale up, or will this style of design hit a wall (power dissipation?) faster than, say, silicon-interconnect fabric?
Time will tell if this is the new path forward or just a curious footnote in the history of semiconductors.