by choilive
92 days ago
I speculate that they are hitting the reticle limit for models not much bigger than this. Judging by the size of the chip in their demonstrator for an 8B model, I'm sure they know this already. Scaling this up means splitting large models across multiple chips (layer or tensor parallelism), which gets complicated quickly and needs really high-bandwidth, low-latency interconnects. Still a REALLY interesting approach with a ton of potential despite the unstated challenges.
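To make the tensor parallelism point concrete, here is a rough sketch of splitting one linear layer column-wise across chips. Everything in it is hypothetical (the function names, the tiny sizes, and "chips" being plain list slices); it only illustrates where the interconnect traffic comes from, not how any real hardware does it.

```python
# Hypothetical sketch: column-wise tensor parallelism for one linear layer.
# "Chips" are simulated as weight shards held in ordinary Python lists.

def matvec(x, w):
    """Multiply row vector x (length n) by matrix w (n rows, m columns)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]

def split_columns(w, num_chips):
    """Give each chip a contiguous block of the output columns of w."""
    m = len(w[0])
    step = (m + num_chips - 1) // num_chips  # ceiling division
    return [[row[s:s + step] for row in w] for s in range(0, m, step)]

def tensor_parallel_matvec(x, w, num_chips):
    shards = split_columns(w, num_chips)
    # Every chip needs the full input x: one broadcast over the interconnect.
    partials = [matvec(x, shard) for shard in shards]
    # Gathering the partial outputs is a second interconnect transfer.
    return [value for part in partials for value in part]

x = [1.0, 2.0, 3.0]
w = [[1.0, 0.0, 2.0, 1.0],
     [0.0, 1.0, 1.0, 0.0],
     [1.0, 1.0, 0.0, 2.0]]

# The sharded result matches the single-chip result.
assert tensor_parallel_matvec(x, w, 2) == matvec(x, w)
```

Each layer pays for a broadcast of the input and a gather of the outputs, so the per-token latency depends on the chip-to-chip link as much as on the compute. Layer (pipeline) parallelism would instead pass activations from one chip to the next once per stage.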