Hacker News
by GeorgeTirebiter 621 days ago
Cerebras is well-known in the AI chip market. They make chips that are an entire wafer.

https://spectrum.ieee.org/cerebras-chip-cs3

6 comments

Cerebras made a great (now deleted) video on the whole computer hosting the wafer: https://web.archive.org/web/20230812020202/https://www.youtu...

It’s fascinating.

This is a great video, thank you for sharing. My favorite part:

"...next we have this rubber sheet, which is very clever, and very patented!"

TIL - web archive saves youtube videos

Wow, 200k amps in a chip. The whole thing looks like an early computer from the 50s.

Yep! Them, SambaNova, and Groq are super exciting mid-late stage startups imo.
Shhhhh, stop telling the normies about the future!

And especially don't tell them to start looking into who "sovereign clouds" actually are!

Interesting that they’ve scaled on-chip memory sublinearly with the growth of transistors between their generations, I would’ve thought they would try to bump that number up. Maybe it’s not a major bottleneck for their training runs?
SRAM is scaling significantly more slowly than logic in recent process nodes.
Ahh that explains it, thanks. Seems like a potentially large problem given their strategy.
They could use something like GCRAM[1] to double capacity if they had to...but it's not clear how much worse performance would be.

[1]https://raaam-tech.com/products/

The performance doesn't look great (yet). See Fig. 7

https://www.eng.biu.ac.il/fishale/files/2020/12/A-1-Mbit-Ful...

Cerebras runs at 1.1 GHz[1], and this was a much earlier design on 16nm so it might be a good fit by now. Their TSMC 5 nm version is scheduled for early 2025.[2]

[1]https://cerebras.ai/blog/cerebras-architecture-deep-dive-fir...

[2]https://www.eenewseurope.com/en/raaam-signs-lead-licensee-fo...

I'd bet that making a chip the size of the wafer has the benefit of not losing any silicon to dicing, the way desktop or GPU chips cut from a wafer do. The major downside is that you either need a massive X and Y exposure size or have to break the wafer into smaller exposures, which means you still need to focus on alignment between the steps. And if a defect can't be corrected, is that wafer just scrap?
They fuse off sections of the wafer with defects just like other manufacturers do in monolithic CPUs (as opposed to chiplets like AMD).
Monolithic silicon doesn't get just 2x as expensive when it gets 2x as large; bigger silicon is massively more expensive. I'm not sure that making each piece require a large chunk of perfect wafer is a fantastic idea, especially when you're looking to unseat juggernauts who already have a great deal of experience making high-quality products.
0.5% overheads for defects. You are not correct.
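
For anyone wanting to put rough numbers on the yield argument above, here is a minimal sketch using the textbook Poisson yield model. Everything numeric in it is an assumption for illustration: the 0.1 defects/cm² density, the 462 cm² die area, and the 100 spare cores are hypothetical values, and the function names are made up for this example. It also assumes each defect knocks out exactly one small core. It is not Cerebras's actual yield model, only a way to see why "needs a perfect wafer" and "fuse off the bad cores" behave so differently.

```python
import math


def perfect_die_yield(area_cm2: float, defects_per_cm2: float) -> float:
    """Poisson model: probability that a die of this area has zero defects."""
    return math.exp(-area_cm2 * defects_per_cm2)


def yield_with_spares(area_cm2: float, defects_per_cm2: float, spare_cores: int) -> float:
    """Probability that the number of defects does not exceed the spare cores.

    Assumes the defect count is Poisson with mean area * density and that
    each defect disables exactly one core that a spare can replace.
    Terms are computed in log space so large means do not underflow early.
    """
    mean_defects = area_cm2 * defects_per_cm2
    if mean_defects == 0:
        return 1.0
    total = 0.0
    for k in range(spare_cores + 1):
        log_term = -mean_defects + k * math.log(mean_defects) - math.lgamma(k + 1)
        total += math.exp(log_term)
    return min(total, 1.0)


# Hypothetical inputs, chosen only to show the shape of the argument.
DEFECT_DENSITY = 0.1   # defects per cm^2 (assumed)
GPU_SIZED_DIE = 8.0    # cm^2 (assumed)
WAFER_SIZED_DIE = 462.0  # cm^2 (assumed)

print("GPU-sized die, must be perfect:  ", perfect_die_yield(GPU_SIZED_DIE, DEFECT_DENSITY))
print("Wafer-sized die, must be perfect:", perfect_die_yield(WAFER_SIZED_DIE, DEFECT_DENSITY))
print("Wafer-sized die, 100 spare cores:", yield_with_spares(WAFER_SIZED_DIE, DEFECT_DENSITY, 100))
```

Under these assumptions a defect-free wafer-sized die essentially never happens, while the same die with a modest pool of spare cores almost always passes, which is the gap between the two comments above.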
How does one cool that!? Heck, how does one power it...