Hacker News
by cgdl 336 days ago
I'd say LLM inference requires both memory capacity and memory bandwidth. Cerebras provides the bandwidth with on-chip SRAM, but not the capacity: an entire wafer has only 44GB of SRAM.
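To put a rough number on the capacity side, here is a back-of-the-envelope sketch in Python. The 44GB figure is the one quoted above; the model sizes and precisions are hypothetical examples, and the estimate counts weights only (no KV cache, no activations), so real requirements would be higher.

```python
import math

WAFER_SRAM_GB = 44  # per-wafer SRAM figure quoted in the comment


def weight_footprint_gb(params_billion, bytes_per_param):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes).

    params_billion * 1e9 params * bytes_per_param / 1e9 bytes per GB
    simplifies to params_billion * bytes_per_param.
    """
    return params_billion * bytes_per_param


def wafers_needed(params_billion, bytes_per_param, sram_gb=WAFER_SRAM_GB):
    """Minimum number of wafers whose combined SRAM holds the weights."""
    return math.ceil(weight_footprint_gb(params_billion, bytes_per_param) / sram_gb)


if __name__ == "__main__":
    # Hypothetical model sizes, chosen only for illustration.
    for params_b, nbytes in [(8, 2), (70, 2), (70, 1)]:
        size = weight_footprint_gb(params_b, nbytes)
        count = wafers_needed(params_b, nbytes)
        print(f"{params_b}B params at {nbytes} byte(s)/param: "
              f"{size} GB of weights, at least {count} wafer(s)")
```

Under these assumptions a 70B-parameter model at 16-bit precision needs 140GB for weights alone, which is more than three wafers' worth of SRAM, while an 8B model fits on one.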